POISSON-DIRICHLET STATISTICS FOR THE EXTREMES OF A
LOG-CORRELATED GAUSSIAN FIELD 

LOUIS-PIERRE ARGUIN AND OLIVIER ZINDY 



Abstract. We study the statistics of the extremes of a discrete Gaussian field with logarithmic
correlations at the level of the Gibbs measure. The model is defined on the periodic interval [0,1]. It is
based on a model introduced by Bacry and Muzy [3], and is similar to the logarithmic Random Energy
Model studied by Carpentier and Le Doussal [14] and more recently by Fyodorov and Bouchaud
[23]. At low temperature, it is shown that the normalized covariance of two points sampled from the
Gibbs measure is either 0 or 1. This is used to prove that the joint distribution of the Gibbs weights
converges in a suitable sense to that of a Poisson-Dirichlet variable. In particular, this proves a
conjecture of Carpentier and Le Doussal that the statistics of the extremes of the log-correlated field
behave as those of i.i.d. Gaussian variables and of branching Brownian motion at the level of the
Gibbs measure. The proof is based on the computation of the free energy of a perturbation of the
model, where a scale-dependent variance is introduced, and on general tools of spin glass theory.

1. Introduction

This paper studies the statistics of the extremes of a Gaussian field whose correlations
decay logarithmically with the distance. The model is related to the process
introduced by Bacry and Muzy [3], and similar to the logarithmic Random Energy
Model or log-REM studied by Carpentier and Le Doussal [14], and Fyodorov and
Bouchaud [23]. Another important log-correlated model is the two-dimensional discrete
Gaussian free field. The model studied here has the advantages of having a
graphical representation of the correlations and a continuous scale parameter, cf. Section 1.1,
which might make the ideas of the proof more transparent. The method
developed here is expected to hold for the two-dimensional discrete Gaussian free
field.






The statistics of the extremes of log-correlated Gaussian fields are expected to
resemble those of i.i.d. Gaussian variables or Random Energy Model (REM) and, at
a finer level, those of branching Brownian motion. In fact, log-correlated fields are
conjectured to be the critical case where correlations start to affect the statistics of
the extremes. The reader is referred to the works of Carpentier and Le Doussal [14];
Fyodorov and Bouchaud [23]; and Fyodorov, Le Doussal and Rosso [24] for physical
motivations of this fact. The analysis for general log-correlated Gaussian fields is
complicated by the fact that, unlike branching Brownian motion, the correlations do
not necessarily exhibit a tree structure.

The approach of this paper is in the spirit of the seminal work of Derrida and
Spohn [18] who studied the extremes of branching Brownian motion using the Gibbs
measure. Even though correlations are not tree-like for general log-correlated models,
such fields can often be decomposed as a sum of independent fields acting on different
scales. The main results of the paper are Theorem 1.4 on the correlations of the
extremes and Theorem 1.5 on the statistics of the Gibbs weights. The results show
that, in effect, the statistics of the extremes of the log-correlated field are the same as
those of branching Brownian motion at the level of the Gibbs measure, as conjectured
by Carpentier and Le Doussal [14]. The proof of the first theorem is based on an
adaptation of a technique of Bovier and Kurkova [10, 11] originally developed for
hierarchical Gaussian fields such as branching Brownian motion. For this purpose,
we need to introduce a family of log-correlated Gaussian models where the variance
of the fields in the scale-decomposition depends on the scale. The free energy of the
perturbed models is computed using ideas of Daviaud [16]. The second theorem on the
Poisson-Dirichlet statistics of the Gibbs weights is proved using the first theorem on
correlations and general spin glass theory results. The approach is robust, cf. Theorem
2.5, and could be of independent interest to prove Poisson-Dirichlet statistics for the
extremes of other Gaussian fields.

1.1. A log-correlated Gaussian field. Following [3], we consider the half-infinite
cylinder

$$\mathcal{C}^+ := \{(x,y) : x \in [0,1]_\sim,\ y \in \mathbb{R}_+^*\},$$

where $[0,1]_\sim$ stands for the unit interval where the two endpoints are identified. We
write $\|x - x'\| := \min\{|x - x'|, 1 - |x - x'|\}$ for the distance on $[0,1]_\sim$.

The following measure is put on $\mathcal{C}^+$:

$$\theta(dx, dy) := y^{-2}\, dx\, dy.$$

Note that $\theta$ is invariant under homogeneous scaling $(x,y) \mapsto (\lambda x, \lambda y)$. For $\sigma > 0$, the
variance parameter, there exists a random measure $\mu$ on $\mathcal{C}^+$ that satisfies:

i) for any measurable set $A$ in $\mathcal{B}(\mathcal{C}^+)$, the random variable $\mu(A)$ is a centered
Gaussian with variance $\sigma^2 \theta(A)$.

ii) for every sequence of disjoint sets $(A_n)_n$ in $\mathcal{B}(\mathcal{C}^+)$, the Borel $\sigma$-algebra associated
with $\mathcal{C}^+$, the random variables $(\mu(A_n))_n$ are independent and

$$\mu\Big(\bigcup_n A_n\Big) = \sum_n \mu(A_n), \quad \text{a.s.}$$



Let $\Omega$ be the probability space on which $\mu$ is defined and let $\mathbb{P}$ be the law of $\mu$. $\Omega$ is
endowed with the $\sigma$-algebras $\mathcal{F}_u$ generated by the random variables $\mu(A)$, for all the
sets $A$ at a distance greater than $u$ from the $x$-axis. The reader is referred to [3] for
the existence of the probability space $(\Omega, (\mathcal{F}_u)_u, \mathbb{P})$.

The subsets needed for the definition of the Gaussian field are the cone-like subsets
$A_u(x)$ of $\mathcal{C}^+$,

$$A_u(x) := \{(s,y) \in \mathcal{C}^+ : y \ge u,\ -f(y)/2 \le s - x \le f(y)/2\},$$

where $f(y) = y$ for $y \in (0, 1/2)$ and $f(y) = 1/2$ otherwise. See Figure 1 for a depiction
of the subsets. Observe that, by construction, if $\|x - x'\| = \ell > u$, then $A_u(x)$ and
$A_u(x')$ intersect exactly above the line $y = \ell$.
The Gaussian process $\omega_u = (\omega_u(x), x \in [0,1]_\sim)$ is defined using the random measure
$\mu$,

$$\omega_u(x) := \mu(A_u(x)), \quad x \in [0,1]_\sim. \tag{1.1}$$

By the properties i) and ii) of $\mu$ listed above, the covariance between $\omega_u(x)$ and $\omega_u(x')$
is given by the integral over $\theta$ of the intersection of $A_u(x)$ and $A_u(x')$:

$$\mathbb{E}[\omega_u(x)\omega_u(x')] = \int_{A_u(x) \cap A_u(x')} \theta(ds, dy). \tag{1.2}$$

The paper focuses on a discrete version of $\omega_u$. Let $N \in \mathbb{N}$ and take $\varepsilon = 1/N$. Define
the set

$$\mathcal{X}_N = \mathcal{X}_\varepsilon := \Big\{0, \frac{1}{N}, \frac{2}{N}, \dots, \frac{i}{N}, \dots, \frac{N-1}{N}\Big\}.$$

The notation $\mathcal{X}_N$ and $\mathcal{X}_\varepsilon$ will be used equally depending on the context. For a given
$N$, the log-correlated Gaussian field is the collection of Gaussian centered random
variables $\omega_\varepsilon(x)$ for $x \in \mathcal{X}_N$:

$$X = (X_x, x \in \mathcal{X}_N) = (\omega_\varepsilon(x), x \in \mathcal{X}_N). \tag{1.3}$$

A compelling feature of this construction is that a scale decomposition for $X$ is easily
obtained from property ii) above. Indeed, it suffices to write the variable $X_x$ as a sum
of independent Gaussian fields corresponding to disjoint horizontal strips of $\mathcal{C}^+$. The
$y$-axis then plays the role of the scale.



The covariances of the field are computed from (1.2) by straightforward integration
(see also Figure 1).

Lemma 1.1. For any $0 < \varepsilon = 1/N < 1/2$,

$$\mathbb{E}[X_x^2] = \sigma^2(\log N + 1 - \log 2), \quad x \in \mathcal{X}_N,$$

$$\mathbb{E}[X_x X_{x'}] = \sigma^2(\log(1/\|x - x'\|) - \log 2), \quad x \ne x' \in \mathcal{X}_N.$$
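The "straightforward integration" behind Lemma 1.1 can be checked numerically. The sketch below is only an illustration for $\sigma = 1$: the function names `cone_width` and `theta_of_intersection` are hypothetical, and the midpoint rule in the variable $\log y$ is a choice made here, not part of the original argument. At height $y$ the two cones overlap on a segment of length $\max(f(y) - \|x - x'\|, 0)$, which is integrated against $y^{-2}\,dy$.

```python
import math

def cone_width(y):
    # f(y) = y on (0, 1/2) and 1/2 otherwise.
    return y if y < 0.5 else 0.5

def theta_of_intersection(dist, u, steps=20_000):
    """theta-measure of the intersection of A_u(x) and A_u(x') when ||x - x'|| = dist.

    dist = 0 gives the variance theta(A_u(x)). Assumes 0 < u < 1/2 and dist <= 1/2.
    """
    lo = max(u, dist)  # the cones only meet above the line y = dist
    # Part below y = 1/2: midpoint rule in v = log y, where dy / y^2 = dv / y.
    a, b = math.log(lo), math.log(0.5)
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        y = math.exp(a + (k + 0.5) * h)
        total += max(cone_width(y) - dist, 0.0) / y * h
    # Part above y = 1/2: constant overlap 1/2 - dist, and the integral of y^-2 over [1/2, inf) is 2.
    total += (0.5 - dist) * 2.0
    return total
```

For instance, `theta_of_intersection(0.0, 1/64)` should be close to $\log 64 + 1 - \log 2$, and `theta_of_intersection(3/64, 1/64)` close to $\log(64/3) - \log 2$.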

Similar constructions of log-correlated Gaussian fields using a random measure on 
cone-like subsets are also possible in two dimensions, see e.g. 



1.2. Main results. Without loss of generality, the results of this section are stated
for the variance parameter $\sigma = 1$. The points where the field is unusually high, the
extremes or the high points, can be studied using a minor adaptation of the arguments
of Daviaud for the two-dimensional discrete Gaussian free field [16]. We denote by
$|A|$ the cardinality of a finite set $A$.

Theorem 1.2. Let

$$\mathcal{H}_N(\gamma) := \{x \in \mathcal{X}_N : X_x \ge \sqrt{2}\,\gamma \log N\}$$

be the set of $\gamma$-high points. Then for any $0 < \gamma < 1$,

$$\lim_{N \to \infty} \frac{\log |\mathcal{H}_N(\gamma)|}{\log N} = 1 - \gamma^2, \quad \text{in probability}.$$

Moreover, for all $\rho > 0$ there exists a constant $c = c(\rho) > 0$ such that

$$\mathbb{P}\big(|\mathcal{H}_N(\gamma)| \le N^{(1-\gamma^2)-\rho}\big) \le \exp\{-c(\log N)^2\},$$

for $N$ large enough.
Figure 1. The two subsets $A_\varepsilon(x)$ and $A_\varepsilon(x')$ for $\varepsilon = 1/N$. The variance
of the variables is given by the integral over $\theta(dt, dy) = y^{-2}\,dt\,dy$ of
the lighter grey area above $\varepsilon = 1/N$, and the covariance by the integral
over the intersection of the subsets, the darker grey region.

The technique of Daviaud is based on a tree approximation introduced by Bolthausen,
Deuschel and Giacomin [5] for the discrete two-dimensional Gaussian free field. There,
the technique is used to obtain the first order of the maximum. The same argument
applies here. Theorem 1.2 and simple Gaussian estimates yield

$$\lim_{N \to \infty} \frac{\max_{x \in \mathcal{X}_N} X_x}{\log N} = \sqrt{2}, \quad \text{in probability}. \tag{1.4}$$



The important feature of Theorem 1.2 and Equation (1.4) is that they are identical
to the results for $N$ i.i.d. Gaussian variables of variance $\log N$. In other words, the
above observables of the high points are not affected by the correlations of the field.
The i.i.d. case is called the Random Energy Model (REM) in the spin glass literature.



The starting point of the paper is to understand to what extent i.i.d. statistics are
a good approximation for more refined observables of the extremes of log-correlated
Gaussian fields. To this end, we turn to tools of statistical physics which allow for a
good control of the correlations.

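First, consider the partition function $Z_N(\beta)$ of the model ($\beta$ stands for the inverse
temperature):

$$Z_N(\beta) := \sum_{x \in \mathcal{X}_N} \exp(\beta X_x), \quad \forall \beta > 0,$$

and the free energy

$$f_N(\beta) := \frac{1}{\log N} \log Z_N(\beta), \quad \forall \beta > 0.$$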



Theorem 1.2 is used to compute the free energy of the model.

Corollary 1.3. Let $\beta_c := \sqrt{2}$. Then, for all $\beta > 0$,

$$f(\beta) := \lim_{N \to \infty} f_N(\beta) = \begin{cases} 1 + \dfrac{\beta^2}{2}, & \text{if } \beta \le \beta_c, \\ \sqrt{2}\,\beta, & \text{if } \beta \ge \beta_c, \end{cases} \quad \text{a.s. and in } L^1.$$

The free energy is the same as for the REM with variance $\log N$. In particular, the
model undergoes freezing above $\beta_c$ in the sense that the quantity $f(\beta)/\beta$ is constant.

More importantly, consider the normalized Gibbs weights or Gibbs measure

$$G_{\beta,N}(x) := \frac{e^{\beta X_x}}{Z_N(\beta)}, \quad x \in \mathcal{X}_N.$$

By design, the Gibbs measure concentrates on the high points of the Gaussian field.
The first main result of the paper is to achieve a control of the correlations at the level
of the Gibbs measure. Precisely, with spin glasses in mind, we consider the normalized
covariance or overlap

$$q(x,y) = q^{(N)}(x,y) := \frac{-\log \|x - y\|}{\log N}, \quad x, y \in \mathcal{X}_N. \tag{1.5}$$

Clearly, $\|x - y\| = \varepsilon^{q(x,y)}$ and $0 \le q(x,y) \le 1$. Moreover, the overlap $q(x,y)$ is equal to
the normalized correlations $\mathbb{E}[X_x X_y]/\mathbb{E}[X_x^2]$ plus a term that goes to zero as $N$ goes
to infinity.

A fundamental object, which records the correlations of high points, is the distribution
function of the overlap sampled from the Gibbs measure. Namely, denote by $G^{\times 2}_{\beta,N}$ the
product measure on $\mathcal{X}_N \times \mathcal{X}_N$. Let $(x_1, x_2)$ be two replicas sampled from $G^{\times 2}_{\beta,N}$. Write
for simplicity $q_{12}$ for $q(x_1, x_2)$. The averaged distribution function of the overlap is:

$$x^{(N)}_\beta(q) := \mathbb{E}\big[G^{\times 2}_{\beta,N}\{q_{12} \le q\}\big], \quad 0 \le q \le 1. \tag{1.6}$$
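For a single realization of the field, the quantity inside the expectation in (1.6) is a finite double sum and can be computed directly. The sketch below is an illustration under stated assumptions: the names `gibbs_weights`, `overlap` and `overlap_cdf` are hypothetical, the field is passed as a plain list indexed by $i$ with $x = i/N$, and the diagonal convention $q(x,x) = 1$ is an assumption made here. Averaging `overlap_cdf` over many sampled fields would approximate $x^{(N)}_\beta(q)$.

```python
import math

def gibbs_weights(field, beta):
    """Normalized Gibbs weights exp(beta * X_x) / Z_N(beta), computed stably."""
    m = max(field)  # subtract the maximum before exponentiating to avoid overflow
    unnorm = [math.exp(beta * (v - m)) for v in field]
    z = sum(unnorm)
    return [w / z for w in unnorm]

def overlap(i, j, n):
    """q(x, y) = -log||x - y|| / log N for x = i/N, y = j/N; q(x, x) = 1 by assumption."""
    if i == j:
        return 1.0
    d = abs(i - j) / n
    dist = min(d, 1.0 - d)  # distance on the periodic interval
    return -math.log(dist) / math.log(n)

def overlap_cdf(field, beta, q):
    """Probability under G x G that q_12 <= q, for one realization of the field."""
    n = len(field)
    g = gibbs_weights(field, beta)
    # The small tolerance guards against rounding when the overlap equals q exactly.
    return sum(g[i] * g[j]
               for i in range(n) for j in range(n)
               if overlap(i, j, n) <= q + 1e-12)
```

The double loop costs $N^2$ operations, which is fine for small experiments but would need vectorization for large $N$.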

More generally, the product measure on $s$ replicas $(x_1, \dots, x_s) \in \mathcal{X}_N^s$ sampled from the
Gibbs measure will be denoted by $G^{\times s}_{\beta,N}$. Let $F : [0,1]^{s(s-1)/2} \to \mathbb{R}$ be a continuous function on
the overlaps of $s$ replicas, that is a function that depends smoothly on $q_{ll'} := q(x_l, x_{l'})$,
$l \ne l'$, for $(x_1, \dots, x_s) \in \mathcal{X}_N^s$. We will write $\mathbb{E}G^{\times s}_{\beta,N}(F(q_{ll'}))$ for the averaged expectation
of $F$ when $(x_1, \dots, x_s)$ is sampled from $G^{\times s}_{\beta,N}$.

The first result is the analogue of results of Derrida and Spohn for the Gibbs measure 
of branching Brownian motion (see Equation (6.19) in [18]), of Chauvin and Rouault
on branching random walks [U] and of Bovier and Kurkova on Derrida's Generalized 
Random Energy Models (GREM) [10, 11].



Theorem 1.4. For $\beta > \beta_c$,

$$\lim_{N \to \infty} x^{(N)}_\beta(q) = \lim_{N \to \infty} \mathbb{E}\big[G^{\times 2}_{\beta,N}\{q_{12} \le q\}\big] = \begin{cases} \dfrac{\beta_c}{\beta}, & \text{for } 0 \le q < 1, \\ 1, & \text{for } q = 1. \end{cases}$$

In other words, the theorem states that for large $N$, the only possible normalized
correlations between high points are 0 or 1. This had been conjectured for this type
of Gaussian field by Carpentier and Le Doussal, see page 16 in [14].

Similarly to [H], the control of the correlations is achieved by introducing a perturbed
version of the model, cf. Section 2.1. In the present case, the proof is more
intricate since the structure of correlations of the Gaussian field for finite $N$ is not
tree-like or ultrametric as in the cases of branching Brownian motion and GREM's.
For example, for branching Brownian motion, $q(x, y)$ corresponds to the branching
time of the common ancestor of two particles at time $t$, $x$ and $y$, divided by $t$. Because
of the branching structure,

(1.7) the inequality $q(x, y) \ge \min\{q(x, z), q(y, z)\}$ is satisfied for all $x, y, z$.
(The terminology ultrametric comes from the fact that the distance induced by the
form $q(\cdot, \cdot)$ is ultrametric when (1.7) holds.) The Parisi Ultrametricity Conjecture in
the spin-glass literature states that, even though tree-like correlations might not be
present for finite $N$, ultrametric correlations are recovered in the limit $N \to \infty$ for a
large class of Gaussian fields at the level of the Gibbs measure, that is:

$$\lim_{N \to \infty} \mathbb{E}\big[G^{\times 3}_{\beta,N}\{q_{12} \ge \min\{q_{13}, q_{23}\}\}\big] = 1. \tag{1.8}$$

It is not hard to see that Theorem 1.4 implies the ultrametricity conjecture for the
Gaussian field considered, since the overlaps can only take value 0 or 1. (In the
language of spin glasses, the field is said to admit a one-step replica symmetry breaking
at low temperature.)
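To see concretely how (1.7) can fail at finite $N$ for the overlap $q(x,y) = -\log\|x-y\|/\log N$, one can search all triples of grid points for the largest violation. This is a sketch with hypothetical names (`overlap`, `worst_ultrametric_defect`) and the assumed convention $q(x,x) = 1$. It is a deterministic statement about the grid, not about Gibbs-sampled points, so it illustrates the obstruction rather than the conjecture (1.8) itself.

```python
import math

def overlap(i, j, n):
    """q(x, y) = -log||x - y|| / log N on the periodic grid; q(x, x) = 1 by assumption."""
    if i == j:
        return 1.0
    d = abs(i - j) / n
    return -math.log(min(d, 1.0 - d)) / math.log(n)

def worst_ultrametric_defect(n):
    """Largest value of min{q(x,z), q(y,z)} - q(x,y) over all triples of grid points.

    A value <= 0 would mean that inequality (1.7) holds for every triple.
    """
    worst = -math.inf
    for i in range(n):
        for j in range(n):
            for k in range(n):
                gap = min(overlap(i, k, n), overlap(j, k, n)) - overlap(i, j, n)
                worst = max(worst, gap)
    return worst
```

Since the circle distance satisfies the triangle inequality, $\|x-y\| \le 2\max\{\|x-z\|, \|y-z\|\}$, so the defect is at most $\log 2/\log N$. Three consecutive grid points attain this bound, which is why the violation is strictly positive for every finite $N$ yet shrinks as $N$ grows.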

The second main result is to describe the entire joint distribution of overlaps sampled
from the Gibbs measure. For the purpose of the statement, we recall the definition
of a Poisson-Dirichlet variable. For $0 < \alpha < 1$, let $\eta = (\eta_i, i \in \mathbb{N})$ be the atoms of a
Poisson random measure on $(0, \infty)$ of intensity measure $s^{-\alpha-1}\,ds$. A Poisson-Dirichlet
variable $\xi$ of parameter $\alpha$ is a probability measure on the space of decreasing weights
$s = (s_1, s_2, \dots)$ with $1 \ge s_1 \ge s_2 \ge \dots \ge 0$ and $\sum_i s_i \le 1$ which has the same law as

$$\xi \overset{d}{=} \Big(\frac{\eta_i}{\sum_j \eta_j},\ i \in \mathbb{N}\Big)_{\downarrow},$$

where $\downarrow$ stands for the decreasing rearrangement.
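The definition lends itself to a simple simulation. The sketch below is an approximation under stated assumptions: only the `n_atoms` largest atoms are kept, and the decreasing atoms are generated as $(\alpha\Gamma_i)^{-1/\alpha}$, where $\Gamma_i$ are the arrival times of a rate-one Poisson process (a standard representation, since the intensity $s^{-\alpha-1}ds$ has tail mass $t^{-\alpha}/\alpha$ on $(t,\infty)$). The function name `poisson_dirichlet_sample` is hypothetical.

```python
import random

def poisson_dirichlet_sample(alpha, n_atoms, rng):
    """Approximate Poisson-Dirichlet weights of parameter alpha, in decreasing order.

    Keeps only the n_atoms largest atoms of the Poisson random measure, so the
    result is a truncation of the infinite sequence.
    """
    arrival = 0.0
    atoms = []
    for _ in range(n_atoms):
        arrival += rng.expovariate(1.0)          # next arrival time Gamma_i
        # The constant alpha^(-1/alpha) is dropped: it cancels in the normalization.
        atoms.append(arrival ** (-1.0 / alpha))
    total = sum(atoms)
    return [a / total for a in atoms]
```

As a sanity check, the empirical mean of $\sum_i \xi_i^2$ over many samples should be close to $1 - \alpha$, the identity used later in Section 2.3 to identify the parameter.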

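Theorem 1.5. Let $\beta > \beta_c$ and $\xi = (\xi_k, k \in \mathbb{N})$ be a Poisson-Dirichlet variable of
parameter $\beta_c/\beta$. Denote by $E$ the expectation with respect to $\xi$. For any continuous
function $F : [0,1]^{s(s-1)/2} \to \mathbb{R}$ of the overlaps of $s$ replicas:

$$\lim_{N \to \infty} \mathbb{E}\big[G^{\times s}_{\beta,N}(F(q_{ll'}))\big] = E\Big[\sum_{k_1 \in \mathbb{N}, \dots, k_s \in \mathbb{N}} \xi_{k_1} \cdots \xi_{k_s}\, F(\delta_{k_l k_{l'}})\Big].$$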



Essentially, the theorem shows that the Gibbs weights of the high points converge
in law to a Poisson-Dirichlet variable. However, it is important to stress that, as in
the case of branching Brownian motion (and unlike the REM), it is not the collection
$(G_{\beta,N}(x), x \in \mathcal{X}_N)_{\downarrow}$ per se that converges to a Poisson-Dirichlet variable. This is
because the continuity of the function $F$ naturally identifies points $x, y$ for which $q(x,y)$
tends to 1 in the limit $N \to \infty$. Rather, the result shows that the Poisson-Dirichlet
weights are formed by the sum of the Gibbs weights of high points that are arbitrarily
close to each other.



1.3. Relation to Previous Results. Bolthausen and Kistler have studied a family
of models called generalized GREM's for which the correlations are not ultrametric
O [H] for finite $N$. By construction, the overlaps of these models can only take a
finite number of values (uniformly in $N$, the number of variables). They compute the
free energies and the Gibbs measure and prove the Parisi ultrametricity conjecture
for these. Bovier and Kurkova [10, 11] have obtained the distribution of the Gibbs
measure for Gaussian fields, called the GREM's, where the values of the overlaps
are not a priori restricted. Their analysis is restricted to models with ultrametric
correlations and includes the case of branching Brownian motion.

The works of Bolthausen, Deuschel and Zeitouni [B], Bramson and Zeitouni ^2] and
Ding [in] establish the tightness of the recentered maximum of the two-dimensional
discrete Gaussian free field. We expect that their method can be applied to the
Gaussian field we consider.

We note that Fang and Zeitouni [22] have studied a branching random walk model 
where the variance of the motion is time-dependent. This model is related to the 
simpler GREM model of spin glasses and to the CREM of Bovier and Kurkova. The 
family of log-correlated Gaussian fields introduced in Section 2.2 is akin to these 
hierarchical models, where the scale parameter replaces the time parameter. 

2. Outline of the Proof 

2.1. A family of perturbed models. In this section, we define a family of Gaussian
fields for which the variance parameter $\sigma$ is scale-dependent. It can be seen as the
GREM analogue for the non-hierarchical Gaussian field considered here. We restrict
ourselves to the case where $\sigma$ takes three values, which is the one needed for the proof
of Theorem 1.4. However, the construction and the results extend to any finite
number of values.

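Fix $\varepsilon = 1/N$. We introduce a scale (or time) parameter $t$ by defining, for any
$t \in [0,1]$,

$$X_x(t) := \omega_{\varepsilon^t}(x), \quad x \in \mathcal{X}_\varepsilon.$$

Observe that for any fixed $x$, the process $(X_x(t))_{0 \le t \le 1}$ has independent increments
and is a martingale for the filtration $(\mathcal{F}_{\varepsilon^t}, t \ge 0)$:

$$\mathbb{E}[X_x(t) \mid \mathcal{F}_{\varepsilon^s}] = X_x(s), \quad \text{for } t \ge s.$$

This is a consequence of the defining property ii) of the random measure $\mu$.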

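The parameters of the family of perturbed models are $\alpha = (\alpha_1, \alpha_2, 1)$, where
$0 < \alpha_1 < \alpha_2 < 1$, and $\sigma = (\sigma_1, \sigma_2, \sigma_3)$ with $\sigma_i > 0$, $i = 1, 2, 3$. For the sake of clarity and
to avoid repetitive trivial corrections, it is assumed throughout the paper that $N^{\alpha_1}$,
$N^{\alpha_2 - \alpha_1}$, and $N^{1-\alpha_2}$ are integers. The Gaussian field $Y^{(\sigma,\alpha)}(t) = (Y^{(\sigma,\alpha)}_x(t), x \in \mathcal{X}_\varepsilon)$ is
defined from the field $X$ as follows:

$$Y^{(\sigma,\alpha)}_x(t) := \begin{cases} \sigma_1 X_x(t), & \text{if } 0 \le t \le \alpha_1, \\ \sigma_1 X_x(\alpha_1) + \sigma_2 (X_x(t) - X_x(\alpha_1)), & \text{if } \alpha_1 \le t \le \alpha_2, \\ \sigma_1 X_x(\alpha_1) + \sigma_2 (X_x(\alpha_2) - X_x(\alpha_1)) + \sigma_3 (X_x(t) - X_x(\alpha_2)), & \text{if } \alpha_2 \le t \le 1. \end{cases} \tag{2.1}$$

The construction is depicted in Figure 2. We write $Y^{(\sigma,\alpha)}$ for the field $(Y^{(\sigma,\alpha)}_x(1), x \in \mathcal{X}_\varepsilon)$.
The dependence on $\sigma$ and $\alpha$ will sometimes be dropped in the notation of $Y$ for
simplicity.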

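Consider the partition function $Z^{(\sigma,\alpha)}_N(\beta)$ of the perturbed model

$$Z^{(\sigma,\alpha)}_N(\beta) := \sum_{x \in \mathcal{X}_N} \exp(\beta Y_x), \tag{2.2}$$

and the free energy

$$f^{(\sigma,\alpha)}_N(\beta) := \frac{1}{\log N} \log Z^{(\sigma,\alpha)}_N(\beta), \quad \forall \beta > 0.$$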

Figure 2. The cone associated with the process $Y_x(\cdot)$.

The log number of high points can be computed for the Gaussian field $Y$ using
Daviaud's technique recursively. The free energy is then obtained by doing an explicit
sum on these high points. This is the object of Section 3 and Section 4. We only write
the results for the two cases needed for Theorem 1.4, as will be explained in Section
2.2. The result is better expressed in terms of the free energy of the REM with $N$
i.i.d. Gaussian variables of variance $\sigma^2 \log N$:

$$f(\beta; \sigma^2) := \begin{cases} 1 + \dfrac{\beta^2 \sigma^2}{2}, & \text{if } \beta \le \beta_c(\sigma^2) := \dfrac{\sqrt{2}}{\sigma}, \\ \sqrt{2}\,\sigma\beta, & \text{if } \beta \ge \beta_c(\sigma^2). \end{cases}$$

Corollary 1.3 follows from the next result with the choice $\sigma_1 = \sigma_2 = \sigma_3$.



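Proposition 2.1. Let $V_{12} := \sigma_1^2 \alpha_1 + \sigma_2^2(\alpha_2 - \alpha_1)$ and $V_{23} := \sigma_2^2(\alpha_2 - \alpha_1) + \sigma_3^2(1 - \alpha_2)$.
Then:

Case 1: if $\sigma_1 \le \sigma_2$ and $\dfrac{V_{12}}{\alpha_2} \ge \sigma_3^2$,

$$\lim_{N \to \infty} f^{(\sigma,\alpha)}_N(\beta) = \alpha_2\, f\Big(\beta; \frac{V_{12}}{\alpha_2}\Big) + (1 - \alpha_2)\, f(\beta; \sigma_3^2);$$

Case 2: if $\sigma_1 \ge \sigma_2$, $\sigma_2 \le \sigma_3$, and $\sigma_1^2 \ge \dfrac{V_{23}}{1 - \alpha_1}$,

$$\lim_{N \to \infty} f^{(\sigma,\alpha)}_N(\beta) = \alpha_1\, f(\beta; \sigma_1^2) + (1 - \alpha_1)\, f\Big(\beta; \frac{V_{23}}{1 - \alpha_1}\Big),$$

where the convergence holds almost surely and in $L^1$.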

The expressions are identical to the free energy of a GREM with three levels where
the parameters $\sigma_i$ fail to satisfy monotonicity conditions, so that it is reduced to a GREM
with two effective levels. The conditions are more easily understood by defining a
piecewise linear function of slopes $\sigma_1^2$, $\sigma_2^2$ and $\sigma_3^2$ on the intervals $[0, \alpha_1]$, $[\alpha_1, \alpha_2]$, $[\alpha_2, 1]$
respectively. In the two cases above, this function fails to be concave. However, it
is easily verified that the effective parameters define the concave hull of the function.
The reader is referred to [13] and [10] for more details on the concavity conditions.
Moreover, in both cases, there are two critical values for $\beta$ corresponding to the
respective $\beta_c(\sigma^2)$ of the two effective parameters $\sigma^2$. In Case 1, the two critical $\beta$'s
are $\sqrt{2\alpha_2/V_{12}}$ and $\sqrt{2}/\sigma_3$, whereas they are $\sqrt{2}/\sigma_1$ and $\sqrt{2(1 - \alpha_1)/V_{23}}$ in Case
2.
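The two-level reduction can be made concrete with a few lines of code. The sketch below is an illustration only: it assumes the REM free energy $f(\beta;\sigma^2)$ with critical value $\sqrt{2}/\sigma$, reads the case conditions of Proposition 2.1 with non-strict inequalities (an assumption), and uses the hypothetical names `f_rem` and `f_perturbed`.

```python
import math

def f_rem(beta, sigma2):
    """Free energy of the REM with N i.i.d. Gaussians of variance sigma2 * log N."""
    sigma = math.sqrt(sigma2)
    beta_c = math.sqrt(2.0) / sigma
    if beta <= beta_c:
        return 1.0 + 0.5 * beta * beta * sigma2
    return math.sqrt(2.0) * sigma * beta

def f_perturbed(beta, sigma, alpha):
    """Limiting free energy in Case 1 or Case 2; raises ValueError otherwise."""
    s1, s2, s3 = sigma
    a1, a2, _ = alpha
    v12 = s1**2 * a1 + s2**2 * (a2 - a1)
    v23 = s2**2 * (a2 - a1) + s3**2 * (1.0 - a2)
    if s1 <= s2 and v12 / a2 >= s3**2:                         # Case 1
        return a2 * f_rem(beta, v12 / a2) + (1.0 - a2) * f_rem(beta, s3**2)
    if s1 >= s2 and s2 <= s3 and s1**2 >= v23 / (1.0 - a1):    # Case 2
        return a1 * f_rem(beta, s1**2) + (1.0 - a1) * f_rem(beta, v23 / (1.0 - a1))
    raise ValueError("parameters fall outside Cases 1 and 2")
```

With $\sigma = (1,1,1)$ the function reduces to `f_rem(beta, 1.0)`, and `f_rem(beta, 1.0) / beta` is constant for $\beta \ge \sqrt{2}$, which is the freezing behaviour of the unperturbed model.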
2.2. The Bovier-Kurkova technique. The proof of Theorem 1.4 relies on determining
the overlap distribution of the original model from the free energy of the
perturbed ones. This approach has been used by Bovier and Kurkova in the case of
the GREM-type models [10, 11].

For $u \in (-1, 1)$ and $t, \delta \in (0, 1)$ such that $t + \delta < 1$, consider the field $(Y_x, x \in \mathcal{X}_N)$
defined in (2.1) with the choice of parameters $\sigma = (1, (1 + u), 1)$ and $\alpha = (t, t + \delta, 1)$,
see Figure 3. (Again, for the sake of clarity, it is assumed that $N^t$, $N^\delta$ and $N^{1-(t+\delta)}$
are integers.) The original Gaussian field $(X_x)$ is recovered at $u = 0$. Note that if
$u > 0$, the parameters correspond to the first case of Proposition 2.1 and if $u < 0$, to
the second. The field $Y$ can also be represented as follows:

$$Y_x = X_x + u\,(X_x(t + \delta) - X_x(t)), \quad x \in \mathcal{X}_N. \tag{2.3}$$

The proof of the next lemma is a simple integration and is postponed to the Appendix,
see Section 5.2.

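Lemma 2.2. Fix $0 < \varepsilon = 1/N < 1/2$, and $t, \delta \in (0,1)$ such that $t + \delta < 1$. Let
$\bar{X}_x := X_x(t + \delta) - X_x(t)$. Then, for $x \in \mathcal{X}_\varepsilon$,

$$\mathbb{E}[\bar{X}_x^2] = \mathbb{E}[\bar{X}_x X_x] = \delta \log N,$$

and, for $x, x' \in \mathcal{X}_\varepsilon$,

$$\mathbb{E}[\bar{X}_x X_{x'}] = \begin{cases} \delta \log N + O(1), & \text{if } t + \delta \le q(x,x') \le 1, \\ (q(x,x') - t) \log N + O(1), & \text{if } t \le q(x,x') \le t + \delta, \\ 0, & \text{if } 0 \le q(x,x') \le t, \end{cases} \tag{2.4}$$

where we recall that $\|x - x'\| = \varepsilon^{q(x,x')}$.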


Figure 3. The perturbed model where the variance parameter is $(1+u)$
on the strip $[\varepsilon^{t+\delta}, \varepsilon^t]$ where $\varepsilon = 1/N$.

This result together with a Gaussian integration by parts yields an important lemma.

Lemma 2.3. For all $t, \delta \in (0, 1)$ such that $t + \delta < 1$, we have

$$\beta \int_t^{t+\delta} x^{(N)}_\beta(s)\, ds + o_N(1) = \frac{1}{\log N}\, \mathbb{E}\Big[\sum_{x \in \mathcal{X}_\varepsilon} G_{\beta,N}(x)\,(X_x(t + \delta) - X_x(t))\Big],$$

where $o_N(1)$ stands for a term that goes to 0 as $N$ goes to $\infty$.

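Proof. Fix $\varepsilon = 1/N$, $t$ and $\delta$. Note that $(\bar{X}_x; (X_{x'}, x' \in \mathcal{X}_\varepsilon))$ is a Gaussian vector of
$N + 1$ variables. Therefore, Gaussian integration by parts (see Lemma 5.3) yields, for
all $x \in \mathcal{X}_\varepsilon$,

$$\beta^{-1}\, \mathbb{E}\Big[\frac{\bar{X}_x\, e^{\beta X_x}}{\sum_{z \in \mathcal{X}_\varepsilon} e^{\beta X_z}}\Big] = -\sum_{x' \in \mathcal{X}_\varepsilon} \mathbb{E}[\bar{X}_x X_{x'}]\; \mathbb{E}\Big[\frac{e^{\beta(X_x + X_{x'})}}{\big(\sum_{z \in \mathcal{X}_\varepsilon} e^{\beta X_z}\big)^2}\Big] + \mathbb{E}[\bar{X}_x X_x]\; \mathbb{E}\Big[\frac{e^{\beta X_x}}{\sum_{z \in \mathcal{X}_\varepsilon} e^{\beta X_z}}\Big].$$

Lemma 2.2 and elementary manipulations imply

$$(\beta \log N)^{-1}\, \mathbb{E}\Big[\sum_{x \in \mathcal{X}_\varepsilon} \bar{X}_x\, G_{\beta,N}(x)\Big] = \int_t^{t+\delta} \sum_{x, x' \in \mathcal{X}_\varepsilon} 1_{\{q(x,x') \le s\}}\, \mathbb{E}\big[G_{\beta,N}(x) G_{\beta,N}(x')\big]\, ds + O\Big(\frac{1}{\log N}\Big)$$

$$= \int_t^{t+\delta} \mathbb{E}\big[G^{\times 2}_{\beta,N}\{q_{12} \le s\}\big]\, ds + O\Big(\frac{1}{\log N}\Big),$$

which concludes the proof of the lemma. □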



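Proof of Theorem 1.4. Fix $\beta > \beta_c = \sqrt{2}$. Write $Z^{(u,t,\delta)}_N(\beta)$ for the partition function
(2.2) for the choices $\sigma = (1, (1 + u), 1)$ and $\alpha = (t, t + \delta, 1)$. Direct differentiation and
Equation (2.3) give

$$\frac{\partial}{\partial u}\Big(\mathbb{E} \log Z^{(u,t,\delta)}_N(\beta)\Big)\Big|_{u=0} = \beta\, \mathbb{E}\Big[\sum_{x \in \mathcal{X}_\varepsilon} (X_x(t + \delta) - X_x(t))\, G_{\beta,N}(x)\Big],$$

which, together with Lemma 2.3, yields

$$\int_t^{t+\delta} x^{(N)}_\beta(s)\, ds = \frac{1}{\beta^2 \log N}\, \frac{\partial}{\partial u}\Big(\mathbb{E} \log Z^{(u,t,\delta)}_N(\beta)\Big)\Big|_{u=0} + o_N(1). \tag{2.5}$$

Observe that $\mathbb{E} f^{(u,t,\delta)}_N(\beta) = (\log N)^{-1}\, \mathbb{E} \log Z^{(u,t,\delta)}_N(\beta)$ is a convex function of $u$. Moreover,
by Proposition 2.1, $\mathbb{E} f^{(u,t,\delta)}_N(\beta)$ converges. Write $f^{(u,t,\delta)}(\beta)$ for the limit. Recall
that the expression for $f^{(u,t,\delta)}(\beta)$ depends on the sign of $u$ and of course on $\beta$. Convexity
in $u$ implies that

$$\lim_{N \to \infty} \frac{\partial}{\partial u}\, \mathbb{E} f^{(u,t,\delta)}_N(\beta) = \frac{\partial}{\partial u} f^{(u,t,\delta)}(\beta) \quad \text{for any } u \text{ where } u \mapsto f^{(u,t,\delta)}(\beta) \text{ is differentiable}. \tag{2.6}$$

We show the function is differentiable at $u = 0$. The derivative can be computed by
Proposition 2.1. For $u$ small enough, $\beta$ is larger than the two critical $\beta$'s. Thus

$$\frac{\partial}{\partial u} f^{(u,t,\delta)}(\beta) = \begin{cases} \sqrt{2}\,\beta\, \dfrac{(1+u)(t+\delta)\,\delta}{\sqrt{(t+\delta)(t + (1+u)^2 \delta)}}, & \text{if } u > 0, \\[2ex] \sqrt{2}\,\beta\, \dfrac{(1+u)(1-t)\,\delta}{\sqrt{(1-t)((1+u)^2 \delta + 1 - (t+\delta))}}, & \text{if } u < 0. \end{cases}$$

From this, it is easily verified that $f^{(u,t,\delta)}(\beta)$ is differentiable at $u = 0$ and

$$\frac{\partial}{\partial u} f^{(u,t,\delta)}(\beta)\Big|_{u=0} = \sqrt{2}\,\beta\,\delta. \tag{2.7}$$

Equations (2.5), (2.6), and (2.7) together imply

$$\lim_{N \to \infty} \int_t^{t+\delta} x^{(N)}_\beta(s)\, ds = \frac{\sqrt{2}}{\beta}\,\delta, \quad \text{for all } t, \delta \in (0, 1), \text{ with } t + \delta < 1. \tag{2.8}$$

This shows weak convergence of the sequence $(x^{(N)}_\beta)_N$ to the distribution function
of the random variable taking values 0 and 1 with respective probability $\sqrt{2}/\beta$ and
$1 - \sqrt{2}/\beta$. Indeed, suppose the convergence does not hold. Since $(x^{(N)}_\beta)_N$ is tight, there
must exist a subsequence that converges weakly to a distribution function $x_\beta$ where,
for some $s_0 \in (0,1)$, $x_\beta(s_0) > \sqrt{2}/\beta$ or $x_\beta(s_0) < \sqrt{2}/\beta$. But $x_\beta$ is non-decreasing,
so (2.8) must be violated for some $t > s_0$ or $t < s_0$ in the limit $N \to \infty$ for $\delta$ small
enough. This concludes the proof of Theorem 1.4. □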
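The differentiability at $u = 0$ used in the proof can be checked numerically. The sketch below is an illustration under stated assumptions: it takes $\beta$ above both critical values, so that each REM term is in its linear regime and the limit of Proposition 2.1 reads $\sqrt{2}\beta\,[\sqrt{\alpha_2 V_{12}} + 1 - \alpha_2]$ in Case 1 and $\sqrt{2}\beta\,[\alpha_1 + \sqrt{(1-\alpha_1)V_{23}}]$ in Case 2. The names `f_limit` and `one_sided_derivatives` are hypothetical.

```python
import math

def f_limit(u, t, delta, beta):
    """Limiting free energy for sigma = (1, 1+u, 1), alpha = (t, t+delta, 1).

    Valid only when beta exceeds both critical values (assumed, not checked).
    """
    if u >= 0:  # Case 1 of Proposition 2.1
        v12 = t + (1.0 + u) ** 2 * delta
        return math.sqrt(2.0) * beta * (math.sqrt((t + delta) * v12) + 1.0 - (t + delta))
    v23 = (1.0 + u) ** 2 * delta + 1.0 - (t + delta)  # Case 2
    return math.sqrt(2.0) * beta * (t + math.sqrt((1.0 - t) * v23))

def one_sided_derivatives(t, delta, beta, h=1e-6):
    """Left and right finite-difference derivatives in u at u = 0."""
    f0 = f_limit(0.0, t, delta, beta)
    right = (f_limit(h, t, delta, beta) - f0) / h
    left = (f0 - f_limit(-h, t, delta, beta)) / h
    return left, right
```

Both one-sided derivatives should agree with $\sqrt{2}\beta\delta$ up to the finite-difference error, even though the two branches come from different cases of the proposition.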



2.3. A spin-glass approach to Poisson-Dirichlet variables. In this section, the
link between Theorem 1.4 and Theorem 1.5 is explained. The technique, inspired from
the study of spin glasses, is general and is of independent interest to prove convergence
to Poisson-Dirichlet statistics.

The first step is to find a good space for the convergence of the random measure
$G_{\beta,N}$. To this aim, note that the collection of functionals $\mathbb{E}G^{\times s}_{\beta,N}[F(q_{ll'})]$ over all $s \in \mathbb{N}$
and all continuous functions on the overlaps of $s$ replicas determines the law of an $\mathbb{N} \times \mathbb{N}$
random matrix, say $R^{(N)} = (R^{(N)}_{ll'})_{l,l' \in \mathbb{N}}$, through the identity:

$$E\big[F(R^{(N)}_{ll'})\big] = \mathbb{E}G^{\times s}_{\beta,N}\big[F(q_{ll'})\big].$$

$R^{(N)}_{ll'}$ is the overlap of the $l$-th and $l'$-th points sampled from $G_{\beta,N}$. We write $E$ for
the expectation of the law of $R^{(N)}$. $R^{(N)}$ is a covariance matrix almost surely and has
only 1's on the diagonal. Moreover, since each point is sampled independently from
the same measure, its law is weakly exchangeable, that is for any permutation $\pi$ of a
finite number of indices:

$$\big(R^{(N)}_{\pi(l)\pi(l')}\big) \overset{d}{=} \big(R^{(N)}_{ll'}\big).$$

It is not hard to see that the laws of random covariance matrices with 1's on the
diagonal and with the above symmetry form a compact space under the weak topology
induced by the convergence of expectation of the continuous functions $F$ on $s$ replicas,
$s \in \mathbb{N}$. This space is called the space of Random Overlap Structures in [2]. In particular,
there exists a subsequence $\{R^{(N_n)}\}$ that converges. Denote the limit random
matrix by $R$. Since $R$ is also weakly exchangeable, it is constructed by sampling from
a random measure exactly as for $R^{(N)}$ by a representation of Dovbysh and Sudakov
[20]. Precisely, there exists a random probability measure, say $\mu_\beta$, on a Hilbert space
$\mathcal{H}$, say $\ell^2(\mathbb{N})$, such that for any continuous function $F$ on $s$ replicas:

$$\lim_{n \to \infty} \mathbb{E}G^{\times s}_{\beta,N_n}\big[F(q_{ll'})\big] = E\big[F(R_{ll'})\big] = E\mu_\beta^{\times s}\big[F(v_l \cdot v_{l'})\big]. \tag{2.9}$$

In the above notation, $s$ vectors of $\mathcal{H}$ are sampled independently from $\mu_\beta$. The inner
product between the $l$-th and $l'$-th copy is denoted by $v_l \cdot v_{l'}$. $E$ is the expectation
on the random measure $\mu_\beta$. Note that, since $q(x,x) \le 1$, the random measure $\mu_\beta$ is
supported on the unit ball.
The first consequence of Theorem 1.4 is that for any limit $\mu_\beta$ of a converging
subsequence:

$$E\big[\mu_\beta^{\times 2}\{v_1 \cdot v_2 \le q\}\big] = \lim_{N \to \infty} \mathbb{E}\big[G^{\times 2}_{\beta,N}\{q_{12} \le q\}\big] = \frac{\beta_c}{\beta}\, 1_{[0,1)}(q) + 1_{\{1\}}(q). \tag{2.10}$$

(The first equality is obtained by bounding $1_{[0,q]}(q_{ll'})$ by continuous functions on two
replicas above and below and by applying (2.9).) Equation (2.10) implies that $\mu_\beta$ is
an atomic measure.

Corollary 2.4. If a subsequence of $(G_{\beta,N})$ converges weakly to $\mu_\beta$ in the sense of
(2.9), then there exist random orthonormal vectors $(e_i, i \in \mathbb{N}) \subset \mathcal{H}$, i.e. such that
$e_i \cdot e_j = \delta_{ij}$, and random weights $\xi = (\xi_i, i \in \mathbb{N})_{\downarrow}$ with $\xi_i \ge 0$, $\sum_{i \in \mathbb{N}} \xi_i = 1$ such that:

$$\mu_\beta = \sum_{i \in \mathbb{N}} \xi_i\, \delta_{e_i}, \quad P\text{-a.s.}$$

Moreover, from (2.10), $E\big[\sum_{i \in \mathbb{N}} \xi_i^2\big] = 1 - \dfrac{\beta_c}{\beta}$.

Proof. $\mu_\beta$ is a random probability measure on the unit sphere of $\mathcal{H}$. Fix a realization
of $\mu_\beta$. Let $B_\varepsilon$ be a ball in $\mathcal{H}$ of radius $\varepsilon$ such that $\mu_\beta(B_\varepsilon) > 0$. Let $(v_l, l \in \mathbb{N})$ be
i.i.d. vectors of $\mathcal{H}$ sampled from $\mu_\beta$. There must be an infinite number of vectors of
this sequence in $B_\varepsilon$ by the second Borel-Cantelli lemma. On the other hand, by (2.10),
the only possible values for $v_l \cdot v_{l'}$ are 0 or 1, $P$-a.s. By taking $\varepsilon$ small enough, this
shows that $v_l \cdot v_{l'} = 1$ for every vector sampled from $B_\varepsilon$. Thus the $v_l$'s sampled from
$B_\varepsilon$ are all equal, showing that if $\mu_\beta(B_\varepsilon) > 0$, $\varepsilon$ small enough, there exists a unique
vector $e_0 \in B_\varepsilon$ such that $\mu_\beta\{e_0\} = \mu_\beta(B_\varepsilon)$. Since this holds for any $B_\varepsilon$, we conclude
there exists a countable (maybe finite) collection $\{e_j\} \subset \mathcal{H}$ such that $\mu_\beta\{e_j\} > 0$.
Moreover, $e_i \cdot e_j = 0$ if $i \ne j$ since $v_l \cdot v_{l'} = 0$ or 1, $P$-a.s. for the sequence of i.i.d.
vectors sampled from $\mu_\beta$. □



To finish the proof of Theorem 1.5, it remains to show that the random weights $\xi$ are
distributed like a Poisson-Dirichlet variable of parameter $\beta_c/\beta$. In fact, the parameter
is already determined by Corollary 2.4, since for a Poisson-Dirichlet variable $\xi'$ of
parameter $x$, $E\big[\sum_i (\xi'_i)^2\big] = 1 - x$ holds, see e.g. Corollary 2.2 in [29]. This will also
imply that for any converging subsequence of $(G_{\beta,N})$ in the sense of (2.9), the limit is the
same. In particular, it implies convergence of the whole sequence by compactness.

To prove the Poisson-Dirichlet statistics of the weights $\xi$, we use the following
characterization theorem of the law, see [30], p. 22 for details. Define for all $m \in \mathbb{N}$
the joint moments of the weights

$$S(n_1, \dots, n_m) = E\Big[\sum_{i_1, \dots, i_m} \xi_{i_1}^{n_1} \cdots \xi_{i_m}^{n_m}\Big] \quad \text{for } n_1, \dots, n_m \ge 1. \tag{2.11}$$

The collection of $S(n_1, \dots, n_m)$, $m \in \mathbb{N}$, determines the law of a random mass-partition,
that is a random variable on ordered sequences $1 \ge \tau_1 \ge \tau_2 \ge \dots \ge 0$ with $\sum_{i \in \mathbb{N}} \tau_i \le 1$.
If $\xi$ is a Poisson-Dirichlet variable, it is shown in [30], Proposition 1.2.8, that the
moments satisfy the recursion relations:

$$S(n_1 + 1, n_2, \dots, n_m) = \frac{S(2)}{s}\, S(n_1, \dots, n_m) + \frac{n_1 - 1}{s}\, S(n_1, \dots, n_m) + \sum_{2 \le l \le m} \frac{n_l}{s}\, S(n_1 + n_l, n_2, \dots, n_{l-1}, n_{l+1}, \dots, n_m), \tag{2.12}$$

where $s = n_1 + \dots + n_m$. It is not hard to verify that all moments $S(n_1, \dots, n_m)$ (and
thus the law of $\xi$) are determined by recursion from $S(2)$ and the identities (2.12).
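The claim that $S(2)$ determines all moments can be illustrated on the single-block moments $S(n)$. The sketch below rests on two assumptions that are not spelled out in the text: that for $m = 1$ the recursion (2.12) reads $S(n+1) = \frac{(n-1) + S(2)}{n}\, S(n)$ with $S(1) = 1$, and that a Poisson-Dirichlet variable of parameter $\alpha$ has $S(n) = \Gamma(n-\alpha)/(\Gamma(n)\Gamma(1-\alpha))$, a standard formula. The function names are hypothetical.

```python
import math

def pd_moment(n, alpha):
    """Assumed closed form of S(n) = E[sum_i xi_i^n] for a Poisson-Dirichlet variable."""
    return math.gamma(n - alpha) / (math.gamma(n) * math.gamma(1.0 - alpha))

def moments_from_recursion(n_max, s2):
    """S(1), ..., S(n_max) obtained from S(2) = s2 alone.

    Uses the single-block (m = 1) form of the recursion:
    S(n + 1) = (n - 1 + S(2)) / n * S(n), starting from S(1) = 1.
    """
    moments = [1.0]
    for n in range(1, n_max):
        moments.append((n - 1 + s2) / n * moments[-1])
    return moments
```

Starting the recursion from $S(2) = 1 - \alpha$ should reproduce the closed-form moments, which is consistent with reading $1 - S(2)$ as the parameter of the limiting Poisson-Dirichlet variable.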



It turns out that these identities are satisfied by $\xi$ defined by Corollary 2.4.

Theorem 2.5. Let $\xi$ be a random mass-partition satisfying the assumptions of Corollary
2.4. The moments $S(n_1, \dots, n_m)$ of $\xi$ satisfy (2.12) for any $m \in \mathbb{N}$ and any
$n_1, \dots, n_m \in \mathbb{N}$. In particular, $\xi$ has the law of a Poisson-Dirichlet variable of parameter
$1 - S(2)$.
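Proof. The identities are a general property of the Gibbs measure $(G_{\beta,N}(x), x \in \mathcal{X}_N)$
of centered Gaussian fields known as the Ghirlanda-Guerra identities. They were
introduced in [25]. It is shown in [27] that, for any $\beta$ where the free energy $f(\beta)$ is
differentiable, the following concentration holds:

$$\lim_{N \to \infty} \frac{1}{\log N}\, \mathbb{E}G_{\beta,N}\big(\big|X_x - \mathbb{E}G_{\beta,N}(X_x)\big|\big) = 0. \tag{2.13}$$

Note that by Corollary 1.3, differentiability holds at all $\beta$ for the Gaussian field
considered. Let $F$ be a continuous function on the overlaps of $s$ replicas. Observe
that (2.13) and the Cauchy-Schwarz inequality imply

$$\lim_{N \to \infty} \frac{1}{\log N}\, \Big|\mathbb{E}G^{\times s}_{\beta,N}\big(X_{x_1} F(q_{ll'})\big) - \mathbb{E}G_{\beta,N}(X_x)\, \mathbb{E}G^{\times s}_{\beta,N}\big(F(q_{ll'})\big)\Big| = 0. \tag{2.14}$$

The two terms can be evaluated by Gaussian integrations by parts, see Lemma 5.3:

$$\frac{1}{\beta \log N}\, \mathbb{E}G_{\beta,N}(X_x) = 1 - \mathbb{E}G^{\times 2}_{\beta,N}(q_{12}) + O\Big(\frac{1}{\log N}\Big) \tag{2.15}$$

and

$$\frac{1}{\beta \log N}\, \mathbb{E}G^{\times s}_{\beta,N}\big(X_{x_1} F(q_{ll'})\big) = -s\, \mathbb{E}G^{\times (s+1)}_{\beta,N}\big(q_{1,s+1} F(q_{ll'})\big) + \sum_{1 \le k \le s} \mathbb{E}G^{\times s}_{\beta,N}\big(q_{1k} F(q_{ll'})\big) + O\Big(\frac{1}{\log N}\Big). \tag{2.16}$$

Finally, recalling (2.14) and assembling (2.15)-(2.16) yields the Ghirlanda-Guerra
identities (see Equation (16) in [25]):

$$\mathbb{E}G^{\times (s+1)}_{\beta,N}\big(q_{1,s+1} F(q_{ll'})\big) = \frac{1}{s}\, \mathbb{E}G^{\times 2}_{\beta,N}(q_{12})\, \mathbb{E}G^{\times s}_{\beta,N}\big(F(q_{ll'})\big) + \frac{1}{s} \sum_{k=2}^{s} \mathbb{E}G^{\times s}_{\beta,N}\big(q_{1k} F(q_{ll'})\big) + o_N(1). \tag{2.17}$$

(Note that the term for $k = 1$ cancels with the 1 since $q_{11} = 1 + o_N(1)$.)

In particular, for any converging subsequence of $(G_{\beta,N})_N$ in the sense of (2.9), one
obtains by Corollary 2.4

$$E\Big[\sum_{k_1, \dots, k_{s+1}} \xi_{k_1} \cdots \xi_{k_{s+1}}\, \delta_{k_1 k_{s+1}}\, F(\delta_{k_l k_{l'}})\Big] = \frac{1}{s}\, E\Big[\sum_{k} \xi_k^2\Big]\, E\Big[\sum_{k_1, \dots, k_s} \xi_{k_1} \cdots \xi_{k_s}\, F(\delta_{k_l k_{l'}})\Big] + \frac{1}{s} \sum_{r=2}^{s} E\Big[\sum_{k_1, \dots, k_s} \xi_{k_1} \cdots \xi_{k_s}\, \delta_{k_1 k_r}\, F(\delta_{k_l k_{l'}})\Big]. \tag{2.18}$$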

To deduce (2.12) from (2.18), we follow ([30], pages 24, 25). The set $\{1, \dots, s\}$
can be decomposed into the disjoint union of sets $I_1, \dots, I_m$ with $|I_j| = n_j$ for all
$1 \le j \le m$. Consider the functions $(F_j)_{1 \le j \le m}$ given by $F_j(\delta_{k_l k_{l'}}) := \prod_{l, l' \in I_j} \delta_{k_l k_{l'}}$, and
define $F := \prod_{1 \le j \le m} F_j$. Then, elementary manipulations imply (2.12). Note that the
second term on the right side of (2.18) yields the last two terms of (2.12). □

3. High points of the perturbed models 

In this section, the log-number of high points at a given level is computed for the
perturbed models introduced in Section 2. The focus is on the two cases described
in Theorem 2.1, though the technique applies to any perturbed model with a finite
number of parameters. The free energies of the models are computed in Section 4.

Let $Y = (Y_x, x \in \mathcal{X}_\epsilon)$ be the Gaussian field introduced in Section 2.1. Recall the
notation and the two choices of parameters $(\sigma, \alpha)$ in Proposition 2.1:

(3.1)    Case 1: $\sigma_1 \le \sigma_2$, $\frac{V_{12}}{\alpha_2} \ge \sigma_3^2$;
         Case 2: $\sigma_1 \ge \sigma_2$, $\sigma_2 \le \sigma_3$, $\sigma_1^2 \ge \frac{V_{23}}{1 - \alpha_1}$.

Define also as before $V_{12} := \sigma_1^2 \alpha_1 + \sigma_2^2(\alpha_2 - \alpha_1)$, $V_{23} := \sigma_2^2(\alpha_2 - \alpha_1) + \sigma_3^2(1 - \alpha_2)$, and
$V_{123} := \sigma_1^2 \alpha_1 + \sigma_2^2(\alpha_2 - \alpha_1) + \sigma_3^2(1 - \alpha_2)$.

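Proposition 3.1.

    $\lim_{N\to\infty} \mathbb{P}\Big( \max_{x \in \mathcal{X}_\epsilon} Y_x \ge \sqrt{2}\, \gamma_{max} \log N \Big) = 0,$

where

    $\gamma_{max} = \gamma_{max}(\sigma, \alpha) := \sqrt{V_{12}\alpha_2} + \sigma_3(1 - \alpha_2)$ for Case 1;
    $\gamma_{max} = \gamma_{max}(\sigma, \alpha) := \sigma_1 \alpha_1 + \sqrt{V_{23}(1 - \alpha_1)}$ for Case 2.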



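Proposition 3.2. Let $\mathcal{H}_N(\gamma) := \{x \in \mathcal{X}_\epsilon : Y_x \ge \sqrt{2}\gamma \log N\}$ be the set of $\gamma$-high
points. Then, for all $0 < \gamma < \gamma_{max}$,

    $\lim_{N\to\infty} \frac{\log |\mathcal{H}_N(\gamma)|}{\log N} = \mathcal{E}^{(\sigma,\alpha)}(\gamma)$, in probability,

where in Case 1:

    $\mathcal{E}^{(\sigma,\alpha)}(\gamma) := 1 - \frac{\gamma^2}{V_{123}}$, if $\gamma < V_{123}\sqrt{\alpha_2 / V_{12}}$;
    $\mathcal{E}^{(\sigma,\alpha)}(\gamma) := (1 - \alpha_2) - \frac{(\gamma - \sqrt{V_{12}\alpha_2})^2}{\sigma_3^2(1 - \alpha_2)}$, if $\gamma \ge V_{123}\sqrt{\alpha_2 / V_{12}}$.

Moreover, for any $\mathcal{E} < \mathcal{E}^{(\sigma,\alpha)}(\gamma)$, there exists $c$ such that

    $\mathbb{P}\big( |\mathcal{H}_N(\gamma)| \le N^{\mathcal{E}} \big) \le \exp\{-c(\log N)^2\}.$

The two propositions will be proved for Case 1, the reasoning for Case 2 being identical.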
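As a numerical sanity check of the Case 1 formulas, the following Python sketch evaluates the threshold $V_{123}\sqrt{\alpha_2/V_{12}}$, the level $\gamma_{max}$ and the entropy of Proposition 3.2 as read here. The function names and the sample parameter values are illustrative assumptions, not part of the paper.

```python
import math

def case1_constants(sigma, alpha):
    """Return (V12, V123, gamma_crit, gamma_max) for Case 1 parameters."""
    s1, s2, s3 = sigma
    a1, a2 = alpha
    v12 = s1**2 * a1 + s2**2 * (a2 - a1)
    v123 = v12 + s3**2 * (1 - a2)
    # Case 1 as read here: sigma_1 <= sigma_2 and V12/alpha_2 >= sigma_3^2.
    if not (s1 <= s2 and v12 / a2 >= s3**2):
        raise ValueError("parameters are not in Case 1")
    gamma_crit = v123 * math.sqrt(a2 / v12)
    gamma_max = math.sqrt(v12 * a2) + s3 * (1 - a2)
    return v12, v123, gamma_crit, gamma_max

def entropy_case1(gamma, sigma, alpha):
    """Exponent of N for the number of gamma-high points in Case 1."""
    v12, v123, gamma_crit, _ = case1_constants(sigma, alpha)
    s3, a2 = sigma[2], alpha[1]
    if gamma < gamma_crit:
        return 1 - gamma**2 / v123
    return (1 - a2) - (gamma - math.sqrt(v12 * a2))**2 / (s3**2 * (1 - a2))

# Hypothetical parameters chosen only to satisfy the Case 1 constraints.
SIGMA = (1.0, 1.2, 1.0)
ALPHA = (0.3, 0.6)
```

With these values the two branches of the entropy agree at the threshold and the entropy vanishes at $\gamma_{max}$, which is what one expects if the maximum of the field sits at level $\sqrt{2}\gamma_{max}\log N$.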
3.1. Proof of Proposition 3.1. The idea is to construct a Gaussian field with hierarchical
correlations that dominates $Y$ at the level of the covariances. The result will
follow by comparison using Slepian's lemma. The same field will be used in the proof
of the upper bound in Proposition 3.2.

Notice that if $\epsilon^{\alpha_2} \le \|x - x'\| \le \epsilon^{\alpha_1}$, the corresponding cone-like sets for $Y_x$ and $Y_{x'}$
intersect between the lines $y = \epsilon^{\alpha_2}$ and $y = \epsilon^{\alpha_1}$. Therefore the covariance of
the variables satisfies, writing $\ell := \|x - x'\|$,

Je y \Je°'i^ y Ji/2 y 

2 A„„ 1/2 



> (^1 log 

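By applying the same reasoning when $\epsilon \le \|x - x'\| \le \epsilon^{\alpha_2}$, one obtains the following
lower bound for the covariance:

(3.2)    $\mathbb{E}[Y_x Y_{x'}] \ge 0$, if $\|x - x'\| \ge \epsilon^{\alpha_1}$;
         $\mathbb{E}[Y_x Y_{x'}] \ge \sigma_1^2\big(\log\frac{1}{2\epsilon^{\alpha_1}} - 1\big)$, if $\epsilon^{\alpha_2} \le \|x - x'\| \le \epsilon^{\alpha_1}$;
         $\mathbb{E}[Y_x Y_{x'}] \ge \sigma_1^2\big(\log\frac{1}{2\epsilon^{\alpha_1}} - 1\big) + \sigma_2^2\big(\log\frac{\epsilon^{\alpha_1}}{\epsilon^{\alpha_2}} - 1\big)$, if $\epsilon \le \|x - x'\| \le \epsilon^{\alpha_2}$.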


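Equation (3.2) is used to construct a Gaussian field $\tilde{Y}$. Define the map $\pi$

    $\pi : \mathcal{X}_\epsilon \to \mathcal{X}_{\epsilon^{\alpha_1}} \times \mathcal{X}_{\epsilon^{\alpha_2}} \times \mathcal{X}_\epsilon$
    $x \mapsto (\pi_1(x), \pi_2(x), x)$

where $\pi_1(x)$ is the unique $y \in \mathcal{X}_{\epsilon^{\alpha_1}}$ such that $\|x - y\| \le \epsilon^{\alpha_1}/2$; $\pi_2(x)$ is the unique
$y \in \mathcal{X}_{\epsilon^{\alpha_2}}$ such that $\|x - y\| \le \epsilon^{\alpha_2}/2$. (In case of equality, there are two possibilities for
$y$. We take the right point.) The pre-image of $y \in \mathcal{X}_{\epsilon^{\alpha_1}}$ under $\pi_1$ consists exactly of the
points in $\mathcal{X}_\epsilon$ that are at a distance less than $\epsilon^{\alpha_1}/2$ from $y$. One can think of $\pi_1(x)$ as
the ancestor of $x$ at the scale $\epsilon^{\alpha_1}$ and $\pi_2(x)$ as the ancestor of $x$ at the scale $\epsilon^{\alpha_2}$.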

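Consider the following Gaussian variables

(3.3)    $(g^{(1)}_x, x \in \mathcal{X}_{\epsilon^{\alpha_1}})$ i.i.d. Gaussians of variance $\sigma_1^2 \alpha_1 \log N - \sigma_1^2 \log 2 - \sigma_1^2$,
         $(g^{(2)}_x, x \in \mathcal{X}_{\epsilon^{\alpha_2}})$ i.i.d. Gaussians of variance $\sigma_2^2(\alpha_2 - \alpha_1) \log N - \sigma_2^2$,
         $(g^{(3)}_x, x \in \mathcal{X}_\epsilon)$ i.i.d. Gaussians of variance $\sigma_3^2(1 - \alpha_2) \log N + 2\sigma_1^2 + \sigma_2^2$.

These three families are also taken independent. Then, the field $\tilde{Y}$ is defined, using
the map $\pi$ above and the Gaussian random variables $g^{(i)}_x$, by

(3.4)    $\tilde{Y}_x = g^{(1)}_{\pi_1(x)} + g^{(2)}_{\pi_2(x)} + g^{(3)}_x.$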



This construction and Equation (3.2) directly imply the following comparison lemma. 

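Lemma 3.3.

    $\mathbb{E}[\tilde{Y}_x^2] = \mathbb{E}[Y_x^2], \quad \forall x \in \mathcal{X}_\epsilon,$
    $\mathbb{E}[\tilde{Y}_x \tilde{Y}_y] \le \mathbb{E}[Y_x Y_y], \quad \forall x \ne y,\ x, y \in \mathcal{X}_\epsilon.$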

The following corollary is a straightforward consequence of the above lemma and
Slepian's lemma, see Corollary 3.12 in [26].

Corollary 3.4. For any $\lambda > 0$,

(3.6)    $\mathbb{P}\Big( \max_{x \in \mathcal{X}_\epsilon} Y_x \ge \lambda \Big) \le \mathbb{P}\Big( \max_{x \in \mathcal{X}_\epsilon} \tilde{Y}_x \ge \lambda \Big).$
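The hierarchical field of (3.3)-(3.4) is easy to sample, which can help build intuition for the comparison argument. The sketch below is a minimal illustration under stated assumptions: $N = 2^n$, the scales $\alpha_1 = k_1/n$ and $\alpha_2 = k_2/n$ are dyadic, and ancestors are obtained by dropping low-order bits (dyadic blocks) rather than by the nearest-point map $\pi$ of the text. The function name and arguments are hypothetical.

```python
import math
import random

def sample_hierarchical_field(n_bits, k1, k2, sigma, rng):
    """Sample g1[anc1(x)] + g2[anc2(x)] + g3[x] for x = 0, ..., 2**n_bits - 1.

    Assumption: alpha_1 = k1/n_bits, alpha_2 = k2/n_bits, and the ancestor of x
    at a coarser scale is the dyadic block containing x (not the map pi).
    """
    s1, s2, s3 = sigma
    n_points = 2 ** n_bits
    log_n = math.log(n_points)
    a1, a2 = k1 / n_bits, k2 / n_bits
    # Variances as in (3.3), including the order-one corrections.
    var1 = s1**2 * a1 * log_n - s1**2 * math.log(2) - s1**2
    var2 = s2**2 * (a2 - a1) * log_n - s2**2
    var3 = s3**2 * (1 - a2) * log_n + 2 * s1**2 + s2**2
    if min(var1, var2, var3) <= 0:
        raise ValueError("N is too small for the corrected variances to be positive")
    g1 = [rng.gauss(0.0, math.sqrt(var1)) for _ in range(2 ** k1)]
    g2 = [rng.gauss(0.0, math.sqrt(var2)) for _ in range(2 ** k2)]
    field = [
        g1[x >> (n_bits - k1)] + g2[x >> (n_bits - k2)] + rng.gauss(0.0, math.sqrt(var3))
        for x in range(n_points)
    ]
    return field, (var1, var2, var3)
```

For example, `sample_hierarchical_field(12, 4, 8, (1.0, 1.0, 1.0), random.Random(0))` returns 4096 values together with the three level variances; the order-one corrections cancel in their sum, in line with the variance identity of Lemma 3.3.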

The Gaussian field $\tilde{Y}$ is almost identical to a GREM model with three levels with
parameters $0 < \alpha_1 < \alpha_2 < 1$ and $\sigma_1, \sigma_2, \sigma_3$, see e.g. [17, 10]. In fact the only aspect
that differs from an exact GREM is the terms of order one in the variances of the
Gaussian random variables $g^{(i)}$'s. However, these do not affect the entropy of the high
points. In Case 1, the field reduces to a two-level GREM with effective parameters
$(V_{12}/\alpha_2, \sigma_3^2)$, $(\alpha_2, 1)$ whereas in Case 2, the effective parameters are $(\sigma_1^2, V_{23}/(1 - \alpha_1))$,
$(\alpha_1, 1)$. The proofs of Proposition 3.1 and of the upper bound of Proposition 3.2 are
based on the following standard GREM result. A proof is given for completeness, but
some details will be omitted. The reader is referred to Theorem 1.1 in [10] where a
stronger result on the maximum is given and to [9], Lecture 9, for more details on the
free energy and on the log-number of high points of a two-level GREM.

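Lemma 3.5. Let $\tilde{Y}$ be the Gaussian field constructed above. Then

    $\mathbb{P}\Big( \max_{x \in \mathcal{X}_\epsilon} \tilde{Y}_x \ge \sqrt{2}\, \gamma_{max} \log N \Big) \to 0, \quad N \to \infty,$

where $\gamma_{max}$ is defined in Proposition 3.1. Moreover,

(3.7)    $\lim_{N\to\infty} \frac{\log |\tilde{\mathcal{H}}_N(\gamma)|}{\log N} = \mathcal{E}^{(\sigma,\alpha)}(\gamma)$ in probability,

where $\mathcal{E}^{(\sigma,\alpha)}(\gamma)$ is defined in Proposition 3.2.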



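Proof. We only prove Case 1, the reasoning in Case 2 being similar. Consider
the field $(\tilde{Y}_x(\alpha_2), x \in \mathcal{X}_{\epsilon^{\alpha_2}})$ where $\tilde{Y}_x(\alpha_2) := g^{(1)}_{\pi_1(x)} + g^{(2)}_{\pi_2(x)}$. Markov's inequality and a
Gaussian estimate, see Lemma 5.1, yield

(3.8)    $\mathbb{P}\Big( \max_{x \in \mathcal{X}_{\epsilon^{\alpha_2}}} \tilde{Y}_x(\alpha_2) \ge \sqrt{2}\sqrt{V_{12}\alpha_2}\, \log N \Big) \to 0, \quad N \to \infty.$

Define

    $\tilde{\mathcal{H}}_N(\gamma_2, \gamma_3) := \{x \in \mathcal{X}_\epsilon : \tilde{Y}_{\pi_2(x)}(\alpha_2) \ge \sqrt{2}\gamma_2 \log N,\ g^{(3)}_x \ge \sqrt{2}\gamma_3 \log N\}.$

Again, Markov's inequality together with a Gaussian estimate gives for $\gamma_2, \gamma_3 > 0$,

    $\mathbb{P}\big( |\tilde{\mathcal{H}}_N(\gamma_2, \gamma_3)| \ge 1 \big) \le C\, \frac{\sqrt{V_{12}\, \sigma_3^2 (1 - \alpha_2)}}{\gamma_2 \gamma_3 \log N}\, N^{1 - \frac{\gamma_2^2}{V_{12}} - \frac{\gamma_3^2}{\sigma_3^2(1 - \alpha_2)}}.$

Equation (3.8) implies that $|\tilde{\mathcal{H}}_N(\gamma_2, \gamma_3)|$ is zero with probability tending to one if
$\gamma_2 > \sqrt{V_{12}\alpha_2}$. Suppose $0 < \gamma_2 \le \sqrt{V_{12}\alpha_2}$. Then, if $\gamma_2 + \gamma_3 \ge \gamma_{max}$, the second
parameter $\gamma_3$ must be greater than $\sigma_3(1 - \alpha_2)$. Therefore $\mathbb{P}(|\tilde{\mathcal{H}}_N(\gamma_2, \gamma_3)| \ge 1)$ goes to
0, when $N$ tends to infinity, in the case $\gamma_2 + \gamma_3 \ge \gamma_{max}$. This implies the first claim.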

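For the second claim, we note first that there is a self-averaging of the log-number
of high points:

    $\lim_{N\to\infty} \frac{\log |\tilde{\mathcal{H}}_N(\gamma_2, \gamma_3)|}{\log N} = \lim_{N\to\infty} \frac{\log \mathbb{E}|\tilde{\mathcal{H}}_N(\gamma_2, \gamma_3)|}{\log N}$, in probability.

This self-averaging holds under the two conditions on $\gamma_2$ and $\gamma_3$ imposed with high
probability by the first part of the proof: $\gamma_2 < \sqrt{V_{12}\alpha_2}$ and $\gamma_2 + \gamma_3 < \gamma_{max}$. This is a
straightforward computation using the second moment method and is done in Lecture
9 in [9]. Note also that a Laplace-method argument yields

    $\lim_{N\to\infty} \frac{\log |\tilde{\mathcal{H}}_N(\gamma)|}{\log N} = \lim_{N\to\infty} \max_{\gamma_2 : |\gamma_2| \le \sqrt{V_{12}\alpha_2}} \frac{\log |\tilde{\mathcal{H}}_N(\gamma_2, \gamma - \gamma_2)|}{\log N}$, in probability.

It remains to notice that, by linearity of the expectation,

(3.9)    $\frac{\log \mathbb{E}|\tilde{\mathcal{H}}_N(\gamma_2, \gamma_3)|}{\log N} = 1 - \frac{\gamma_2^2}{V_{12}} - \frac{\gamma_3^2}{\sigma_3^2(1 - \alpha_2)} + o_N(1).$

For a given $\gamma = \gamma_2 + \gamma_3$, the expression on the right in (3.9) is maximized at

    $\gamma_2^* = \gamma\, \frac{V_{12}}{V_{123}}, \qquad \gamma_3^* = \gamma - \gamma_2^* = \gamma\, \frac{\sigma_3^2(1 - \alpha_2)}{V_{123}}.$

If $\gamma < \gamma_{crit} := V_{123}\sqrt{\alpha_2 / V_{12}}$, then $\gamma_2^*$ and $\gamma_3^*$ satisfy these conditions. Equation (3.9)
evaluated at $\gamma_2^*$ and $\gamma_3^*$ equals $\mathcal{E}^{(\sigma,\alpha)}(\gamma)$. If $\gamma \ge \gamma_{crit}$, then (3.9) is maximized for $\gamma_2$
tending to $\sqrt{V_{12}\alpha_2}$ and $\mathcal{E}^{(\sigma,\alpha)}(\gamma)$ is again recovered. □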
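The constrained maximization at the end of the proof can be checked numerically. The sketch below maximizes the right side of (3.9), without the $o_N(1)$ term, over a grid of $\gamma_2 \in [0, \sqrt{V_{12}\alpha_2}]$ with $\gamma_3 = \gamma - \gamma_2$, and compares the result with the two-regime Case 1 formula. Function names and parameter values are illustrative assumptions.

```python
import math

def exponent(g2, g3, v12, s3, a2):
    """Right side of (3.9) without the o(1) term."""
    return 1 - g2**2 / v12 - g3**2 / (s3**2 * (1 - a2))

def entropy_by_grid(gamma, v12, s3, a2, steps=20000):
    """Maximize (3.9) over 0 <= gamma_2 <= sqrt(V12 * alpha_2), gamma_3 = gamma - gamma_2."""
    cap = math.sqrt(v12 * a2)
    return max(
        exponent(cap * i / steps, gamma - cap * i / steps, v12, s3, a2)
        for i in range(steps + 1)
    )

def entropy_closed_form(gamma, v12, s3, a2):
    """Two-regime Case 1 entropy, with threshold V123 * sqrt(alpha_2 / V12)."""
    v123 = v12 + s3**2 * (1 - a2)
    if gamma < v123 * math.sqrt(a2 / v12):
        return 1 - gamma**2 / v123
    return (1 - a2) - (gamma - math.sqrt(v12 * a2))**2 / (s3**2 * (1 - a2))

# Hypothetical Case 1 values: V12 = 0.732, sigma_3 = 1, alpha_2 = 0.6.
V12, S3, A2 = 0.732, 1.0, 0.6
```

Below the threshold the grid maximum is attained in the interior of the interval, while above it the maximum sits at the boundary $\gamma_2 = \sqrt{V_{12}\alpha_2}$, which is the freezing of the first level described in the proof.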



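3.2. Proof of Proposition 3.2. Proposition 3.2 asserts that, for all $\rho > 0$,

    $\mathbb{P}\Big( \Big| \frac{\log |\mathcal{H}_N(\gamma)|}{\log N} - \mathcal{E}^{(\sigma,\alpha)}(\gamma) \Big| > \rho \Big) \to 0, \quad N \to \infty.$

The proof is split in two parts. First, the upper bound $\mathbb{P}(|\mathcal{H}_N(\gamma)| \ge N^{\mathcal{E}^{(\sigma,\alpha)}(\gamma) + \rho})$
is shown to converge to 0 by comparing to the field $\tilde{Y}$ constructed in the last section.
Second, the lower bound $\mathbb{P}(|\mathcal{H}_N(\gamma)| \le N^{\mathcal{E}^{(\sigma,\alpha)}(\gamma) - \rho})$ is shown to decay to zero
following the argument of Daviaud [16].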



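3.2.1. Proof of the upper bound in Proposition 3.2. The first result is a comparison in
the spirit of Corollary 3.4.

Corollary 3.6. Let $\mathcal{H}_N(\gamma) = \{x \in \mathcal{X}_\epsilon : Y_x \ge \sqrt{2}\gamma \log N\}$ and similarly $\tilde{\mathcal{H}}_N(\gamma)$ for $\tilde{Y}$. For
any $M \in \mathbb{N}$,

(3.10)    $\mathbb{P}\big( |\mathcal{H}_N(\gamma)| \ge M \big) \le \mathbb{P}\big( |\tilde{\mathcal{H}}_N(\gamma)| \ge M \big).$

Proof. The proof is again a consequence of Lemma 3.3 and Slepian's lemma, see
Corollary 3.12 in [26]. They imply that for any $\lambda_x \in \mathbb{R}$, $x \in \mathcal{X}_\epsilon$,

(3.11)    $\mathbb{P}\big( \tilde{Y}_x \ge \lambda_x,\ x \in \mathcal{X}_\epsilon \big) \ge \mathbb{P}\big( Y_x \ge \lambda_x,\ x \in \mathcal{X}_\epsilon \big).$

The integer moments of $|\mathcal{H}_N(\gamma)|$ can be expressed as a linear combination of probabilities:

    $\mathbb{E}\big[ |\mathcal{H}_N(\gamma)|^r \big] = \sum_{x_1, \dots, x_r \in \mathcal{X}_\epsilon} \mathbb{P}\big( Y_{x_1} \ge \sqrt{2}\gamma \log N, \dots, Y_{x_r} \ge \sqrt{2}\gamma \log N \big)$
    $\le \sum_{x_1, \dots, x_r \in \mathcal{X}_\epsilon} \mathbb{P}\big( \tilde{Y}_{x_1} \ge \sqrt{2}\gamma \log N, \dots, \tilde{Y}_{x_r} \ge \sqrt{2}\gamma \log N \big) = \mathbb{E}\big[ |\tilde{\mathcal{H}}_N(\gamma)|^r \big].$

The corollary follows from the inequality for the moments because the variables
$|\mathcal{H}_N(\gamma)|$ and $|\tilde{\mathcal{H}}_N(\gamma)|$ are nonnegative and bounded by $N$. □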
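The proof of the upper bound in Proposition 3.2 can now be concluded. Let $\rho > 0$.
Corollary 3.6 implies

    $\mathbb{P}\big( |\mathcal{H}_N(\gamma)| \ge N^{\mathcal{E}^{(\sigma,\alpha)}(\gamma) + \rho} \big) \le \mathbb{P}\big( |\tilde{\mathcal{H}}_N(\gamma)| \ge N^{\mathcal{E}^{(\sigma,\alpha)}(\gamma) + \rho} \big).$

On the other hand, the right side goes to zero by Lemma 3.5 since

    $\mathbb{P}\big( |\tilde{\mathcal{H}}_N(\gamma)| \ge N^{\mathcal{E}^{(\sigma,\alpha)}(\gamma) + \rho} \big) \le \mathbb{P}\Big( \Big| \frac{\log |\tilde{\mathcal{H}}_N(\gamma)|}{\log N} - \mathcal{E}^{(\sigma,\alpha)}(\gamma) \Big| \ge \rho \Big).$ □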



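3.2.2. Proof of the lower bound in Proposition 3.2. The proof of the lower bound is
a finite recursive argument. Two lemmas are needed. The first is a generalization of
the lower bound in Daviaud's theorem (see Theorem 1.2 of [16]).

Lemma 3.7. Let $0 < \alpha_0 < \alpha \le 1$. Suppose that the parameter $\sigma$ is constant on the
strip $[0, 1] \times [\epsilon^{\alpha}, \epsilon^{\alpha_0}]$, and that the event

    $B_0 := \big\{ \#\{x \in \mathcal{X}_{\epsilon^{\alpha_0}} : Y_x(\alpha_0) \ge \sqrt{2}\gamma_0 \log N\} \ge N^{\mathcal{E}_0} \big\},$

is such that

    $\mathbb{P}(B_0^c) \le \exp\{-c_0(\log N)^2\},$

for some $\gamma_0 \ge 0$, $\mathcal{E}_0 > 0$ and $c_0 > 0$.
Let

    $\mathcal{E}(\gamma) := \mathcal{E}_0 + (\alpha - \alpha_0) - \frac{(\gamma - \gamma_0)^2}{\sigma^2(\alpha - \alpha_0)}.$

Then, for any $\gamma$ such that $\mathcal{E}(\gamma) > 0$ and any $\mathcal{E} < \mathcal{E}(\gamma)$, there exists $c$ such that

    $\mathbb{P}\big( \#\{x \in \mathcal{X}_{\epsilon^{\alpha}} : Y_x(\alpha) \ge \sqrt{2}\gamma \log N\} \le N^{\mathcal{E}} \big) \le \exp\{-c(\log N)^2\}.$

We stress that $\gamma$ may be such that $\mathcal{E}(\gamma) < \mathcal{E}_0$. The second lemma, which follows,
serves as the starting point of the recursion and is analogous to Lemma 8 in [5].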

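Lemma 3.8. For any $0 < \alpha < \alpha_1$, there exist $\mathcal{E} = \mathcal{E}(\alpha) > 0$ and $c = c(\alpha) > 0$ such that

    $\mathbb{P}\big( \#\{x \in \mathcal{X}_{\epsilon^{\alpha}} : Y_x(\alpha) \ge 0\} \le N^{\mathcal{E}} \big) \le \exp\{-c(\log N)^2\}.$

We first conclude the proof of the lower bound in Proposition 3.2 using the two
above lemmas.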


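Proof of the lower bound of Proposition 3.2. Let $\gamma$ be such that $0 < \gamma < \gamma_{max}$. Choose $\mathcal{E}$
such that $\mathcal{E} < \mathcal{E}^{(\sigma,\alpha)}(\gamma)$. It will be shown that for some $c > 0$

(3.12)    $\mathbb{P}\big( |\mathcal{H}_N(\gamma)| \le N^{\mathcal{E}} \big) \le \exp\{-c(\log N)^2\}.$

By Lemma 3.8, for $\alpha_0 < \alpha_1$ arbitrarily close to 0 and $\gamma_0 = 0$, there exist $\mathcal{E}_0 =
\mathcal{E}_0(\alpha_0) > 0$ and $c_0 = c_0(\alpha_0) > 0$, such that

(3.13)    $\mathbb{P}\big( \#\{x \in \mathcal{X}_{\epsilon^{\alpha_0}} : Y_x(\alpha_0) \ge 0\} \le N^{\mathcal{E}_0} \big) \le \exp\{-c_0(\log N)^2\}.$

Observe that we have $0 < \mathcal{E}_0 < \alpha_0$. Moreover, let

(3.14)    $\mathcal{E}_1(\gamma_1) := \mathcal{E}_0 + (\alpha_1 - \alpha_0) - \frac{\gamma_1^2}{\sigma_1^2(\alpha_1 - \alpha_0)}.$

Lemma 3.7 is applied from $\alpha_0$ to $\alpha_1$. For any $\gamma_1$ with $\mathcal{E}_1(\gamma_1) > 0$ and any $\mathcal{E}_1 < \mathcal{E}_1(\gamma_1)$,
there exists $c_1 > 0$ such that

    $\mathbb{P}\big( \#\{x \in \mathcal{X}_{\epsilon^{\alpha_1}} : Y_x(\alpha_1) \ge \sqrt{2}\gamma_1 \log N\} \le N^{\mathcal{E}_1} \big) \le \exp\{-c_1(\log N)^2\}.$

Therefore, Lemma 3.7 can be applied from $\alpha_1$ to $\alpha_2$ for any $\gamma_1$ with $\mathcal{E}_1(\gamma_1) > 0$. Define
similarly

(3.15)    $\mathcal{E}_2(\gamma_1, \gamma_2) := \mathcal{E}_1(\gamma_1) + (\alpha_2 - \alpha_1) - \frac{(\gamma_2 - \gamma_1)^2}{\sigma_2^2(\alpha_2 - \alpha_1)}.$

Then, for any $\gamma_2$ with $\mathcal{E}_2(\gamma_1, \gamma_2) > 0$, and $\mathcal{E}_2 < \mathcal{E}_2(\gamma_1, \gamma_2)$, there exists $c_2 > 0$ such that

    $\mathbb{P}\big( \#\{x \in \mathcal{X}_{\epsilon^{\alpha_2}} : Y_x(\alpha_2) \ge \sqrt{2}\gamma_2 \log N\} \le N^{\mathcal{E}_2} \big) \le \exp\{-c_2(\log N)^2\}.$

Finally, the lemma is applied from $\alpha_2$ to 1 (where $\gamma_1, \gamma_2$ are such that $\mathcal{E}_1(\gamma_1) > 0$ and
$\mathcal{E}_2(\gamma_1, \gamma_2) > 0$). Define

(3.16)    $\mathcal{E}_3(\gamma_1, \gamma_2, \gamma_3) := \mathcal{E}_2(\gamma_1, \gamma_2) + (1 - \alpha_2) - \frac{(\gamma_3 - \gamma_2)^2}{\sigma_3^2(1 - \alpha_2)}.$

Then for any $\gamma_3$ with $\mathcal{E}_3(\gamma_1, \gamma_2, \gamma_3) > 0$ and $\mathcal{E}_3 < \mathcal{E}_3(\gamma_1, \gamma_2, \gamma_3)$, there exists $c_3 > 0$ such
that

(3.17)    $\mathbb{P}\big( \#\{x \in \mathcal{X}_\epsilon : Y_x \ge \sqrt{2}\gamma_3 \log N\} \le N^{\mathcal{E}_3} \big) \le \exp\{-c_3(\log N)^2\}.$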



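Recalling that $0 < \mathcal{E}_0 < \alpha_0$, Equation (3.12) follows from (3.17) if it is proved that
$\lim_{\alpha_0 \to 0} \mathcal{E}_3(\gamma_1, \gamma_2, \gamma) = \mathcal{E}^{(\sigma,\alpha)}(\gamma)$ for an appropriate choice of $\gamma_1$ and $\gamma_2$ (in particular
such that $\mathcal{E}_1(\gamma_1) > 0$ and $\mathcal{E}_2(\gamma_1, \gamma_2) > 0$). It is easily verified that, for a given $\gamma$, the
quantity $\mathcal{E}_3(\gamma_1, \gamma_2, \gamma)$ is maximized at

    $\gamma_1^* = \gamma\, \frac{\sigma_1^2(\alpha_1 - \alpha_0)}{V_{123} - \sigma_1^2 \alpha_0}, \qquad \gamma_2^* = \gamma\, \frac{V_{12} - \sigma_1^2 \alpha_0}{V_{123} - \sigma_1^2 \alpha_0}.$

Plugging these back in (3.14) and (3.15) shows that $\mathcal{E}_1(\gamma_1^*) > 0$ and $\mathcal{E}_2(\gamma_1^*, \gamma_2^*) > 0$
provided that

    $\gamma < \frac{V_{123}}{\sigma_1}$ and $\gamma < V_{123}\sqrt{\alpha_2 / V_{12}} =: \gamma_{crit},$

with $\alpha_0$ small enough (depending on $\gamma$). Note that the second condition on $\gamma$ implies
the first since $\sigma_1 \le \sigma_2$. Furthermore, since

    $\mathcal{E}_3(\gamma_1^*, \gamma_2^*, \gamma) = \mathcal{E}_0 + (1 - \alpha_0) - \frac{\gamma^2}{V_{123} - \sigma_1^2 \alpha_0},$

we obtain $\lim_{\alpha_0 \to 0} \mathcal{E}_3(\gamma_1^*, \gamma_2^*, \gamma) = \mathcal{E}^{(\sigma,\alpha)}(\gamma)$, which concludes the proof in the case
$0 < \gamma < \gamma_{crit}$.

If $\gamma_{crit} \le \gamma < \gamma_{max}$, the condition $\mathcal{E}_2(\gamma_1^*, \gamma_2^*) > 0$ will be violated as $\alpha_0$ goes to
zero (note however that $\mathcal{E}_1(\gamma_1^*)$ remains positive). In this case, for $\nu > 0$, pick $\gamma_2^{**} =
\sqrt{V_{12}\alpha_2} - \nu$ such that $\mathcal{E}_2(\gamma_1^*, \gamma_2^{**}) > 0$. The first term in $\gamma_2^{**}$ corresponds to $\gamma_2^*$ evaluated
at $\gamma_{crit}$ for $\alpha_0 = 0$. In particular, $\lim_{\alpha_0 \to 0, \nu \to 0} \mathcal{E}_2(\gamma_1^*, \gamma_2^{**}) = 0$. From (3.16), this shows
that

    $\lim_{\alpha_0 \to 0, \nu \to 0} \mathcal{E}_3(\gamma_1^*, \gamma_2^{**}, \gamma) = (1 - \alpha_2) - \frac{(\gamma - \sqrt{V_{12}\alpha_2})^2}{\sigma_3^2(1 - \alpha_2)} = \mathcal{E}^{(\sigma,\alpha)}(\gamma).$

Note that $\mathcal{E}^{(\sigma,\alpha)}(\gamma)$ is strictly positive if and only if $\gamma < \sqrt{V_{12}\alpha_2} + \sigma_3(1 - \alpha_2)$.
This concludes the proof of (3.12).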
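The claim that the three-step exponent built from (3.14)-(3.16) is maximized by spreading $\gamma$ proportionally to the variance accumulated at each step can be verified numerically. The sketch below is illustrative only; the function names and the sample values of $\alpha_0$ and $\mathcal{E}_0$ are assumptions.

```python
def e3(g1, g2, g, e0, a0, sigma, alpha):
    """Three-step exponent obtained by chaining (3.14), (3.15) and (3.16)."""
    s1, s2, s3 = sigma
    a1, a2 = alpha
    e1 = e0 + (a1 - a0) - g1**2 / (s1**2 * (a1 - a0))          # (3.14)
    e2 = e1 + (a2 - a1) - (g2 - g1)**2 / (s2**2 * (a2 - a1))   # (3.15)
    return e2 + (1 - a2) - (g - g2)**2 / (s3**2 * (1 - a2))    # (3.16)

def maximizers(g, a0, sigma, alpha):
    """Return (gamma_1*, gamma_2*, V123 - sigma_1^2 * alpha_0)."""
    s1, s2, s3 = sigma
    a1, a2 = alpha
    v12 = s1**2 * a1 + s2**2 * (a2 - a1)
    v123 = v12 + s3**2 * (1 - a2)
    denom = v123 - s1**2 * a0
    return g * s1**2 * (a1 - a0) / denom, g * (v12 - s1**2 * a0) / denom, denom

# Hypothetical values: small alpha_0 and E_0 < alpha_0, Case 1 parameters.
SIGMA, ALPHA, A0, E0, G = (1.0, 1.2, 1.0), (0.3, 0.6), 0.01, 0.005, 0.5
```

Evaluating `e3` at the output of `maximizers` reproduces $\mathcal{E}_0 + (1 - \alpha_0) - \gamma^2/(V_{123} - \sigma_1^2\alpha_0)$, and small perturbations of either argument lower the value, as expected for a concave quadratic.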
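Proof of Lemma 3.7. Let $\gamma$ be such that $\mathcal{E}(\gamma) > 0$ and $\mathcal{E}$ such that $0 < \mathcal{E} < \mathcal{E}(\gamma)$. Pick
$\bar{\gamma} > \gamma$ such that

(3.18)    $\mathcal{E}(\bar{\gamma}) > \mathcal{E} > 0.$

Since $\bar{\gamma} > \gamma$, there exists $\xi \in (0, 1)$ such that

(3.19)    $\bar{\gamma}(1 - \xi) > \gamma.$

For $K \in \mathbb{N}$ (which will be fixed later), we set

    $\eta_\ell := \alpha_0 + \frac{\ell - 1}{K}(\alpha - \alpha_0), \quad 1 \le \ell \le K + 1,$
    $\lambda_\ell := \gamma_0 + \frac{\ell - 1}{K}(\bar{\gamma} - \gamma_0)(1 - \xi), \quad 1 \le \ell \le K + 1.$

Observe that the $\eta_\ell$'s and the $\lambda_\ell$'s satisfy $\eta_1 = \alpha_0 < \eta_2 < \dots < \eta_K < \eta_{K+1} = \alpha$, and
$\lambda_1 = \gamma_0 < \lambda_2 < \dots < \lambda_K < \lambda_{K+1} = (1 - \xi)\bar{\gamma} + \xi\gamma_0$. Consider the sets $\Lambda_\ell$ given by:

    $\Lambda_\ell := \big\{ x^{(\ell)} = (x_1, \dots, x_\ell) : x_i \in \mathcal{X}_{2\epsilon^{\eta_i}},\ \forall 1 \le i \le \ell,$ and $\|x_{i+1} - x_i\| \le \epsilon^{\eta_i}/2 \big\},$

for $1 \le \ell \le K + 1$. Note that only half of the $x_i$'s in $\mathcal{X}_{\epsilon^{\eta_i}}$ are considered. Also, to
each $x_i$ we consider the points $x_{i+1}$ in $\mathcal{X}_{2\epsilon^{\eta_{i+1}}}$ that are close to $x_i$. By analogy with a
branching process, these points can be thought of as the children of $x_i$. The reason
for these two choices is that the cones corresponding to the variables $Y_{x_{i+1}}(\eta_{i+1})$ and
$Y_{x'_{i+1}}(\eta_{i+1})$ do not intersect below the line $y = \epsilon^{\eta_i}$ if $x_i \ne x'_i$, see Figure 4.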



Figure 4. Approximation by a tree-like structure. The black circles
symbolize the children of the white circle, while the black squares symbolize
the children of the white square.

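Now consider the sets of high points of $\Lambda_\ell$

    $A_\ell := \big\{ x^{(\ell)} \in \Lambda_\ell : Y_{x_i}(\eta_i) \ge \sqrt{2}\lambda_i \log N,\ \forall 1 \le i \le \ell \big\}, \quad 1 \le \ell \le K + 1,$

and

    $B_\ell := \{ \# A_\ell \ge n_\ell \}, \quad 1 \le \ell \le K + 1,$

where

(3.20)    $n_\ell := N^{\mathcal{E}_0 + \frac{\ell - 1}{K}\big( (\alpha - \alpha_0) - \frac{(\bar{\gamma} - \gamma_0)^2}{\sigma^2(\alpha - \alpha_0)} \big)}, \quad 1 \le \ell \le K + 1,$

such that $N^{\mathcal{E}_0} = n_1$ and $n_{K+1} = N^{\mathcal{E}(\bar{\gamma})}$. Furthermore, with these definitions and the
choice of $\bar{\gamma}$ in (3.19) and (3.18), we have for large $N$

    $B_{K+1} = \{ \# A_{K+1} \ge n_{K+1} \}$
    $\subset \big\{ \#\{x \in \mathcal{X}_{\epsilon^{\alpha}} : Y_x(\alpha) \ge \sqrt{2}((1 - \xi)\bar{\gamma} + \xi\gamma_0) \log N\} \ge N^{\mathcal{E}(\bar{\gamma})} \big\}$
    $\subset \big\{ \#\{x \in \mathcal{X}_{\epsilon^{\alpha}} : Y_x(\alpha) \ge \sqrt{2}\gamma \log N\} \ge N^{\mathcal{E}} \big\}.$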

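It is thus sufficient to find a bound for $\mathbb{P}(B_{K+1}^c)$ to prove the lemma. For events $C_\ell$ to
be defined in (3.23), we use the elementary bound $\mathbb{P}(B_{\ell+1}^c) \le \mathbb{P}(B_{\ell+1}^c \cap B_\ell \cap C_\ell^c) +
\mathbb{P}(C_\ell) + \mathbb{P}(B_\ell^c)$ which applied recursively gives

(3.21)    $\mathbb{P}(B_{K+1}^c) \le \sum_{\ell=2}^{K+1} \big( \mathbb{P}(B_\ell^c \cap B_{\ell-1} \cap C_{\ell-1}^c) + \mathbb{P}(C_{\ell-1}) \big) + \mathbb{P}(B_1^c).$

The last term has the correct bound by assumption. It remains to bound the ones
appearing in the sum.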

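On the event $B_\ell$, there exist at least $n_\ell$ high $\ell$-branches $x^{(\ell)} = (x_1, \dots, x_\ell)$; these
are branches that satisfy $Y_{x_i}(\eta_i) \ge \sqrt{2}\lambda_i \log N$ for $1 \le i \le \ell$. Select the first $n_\ell$ such
$\ell$-branches and denote them by $x_j^{(\ell)} = (x_{j,1}, \dots, x_{j,\ell})$, for all $1 \le j \le n_\ell$. Consider the
set $A_{j,\ell}$, the children of $x_{j,\ell}$ at level $\eta_{\ell+1}$:

    $A_{j,\ell} := \big\{ x \in \mathcal{X}_{2\epsilon^{\eta_{\ell+1}}} : \|x - x_{j,\ell}\| \le \epsilon^{\eta_\ell}/2 \big\}.$

It holds

    $B_\ell \cap B_{\ell+1}^c \subset B_\ell \cap \Big\{ \sum_{j=1}^{n_\ell} |A_{j,\ell}|\, \zeta_j \le n_{\ell+1} \Big\} \subset B_\ell \cap \Big\{ \sum_{j=1}^{n_\ell} \zeta_j \le \frac{2 n_{\ell+1}}{N^{(\alpha - \alpha_0)/K}} \Big\},$

where

(3.22)    $\zeta_j := \frac{1}{|A_{j,\ell}|} \sum_{x \in A_{j,\ell}} 1_{\big\{ Y_x(\eta_{\ell+1}) - Y_{x_{j,\ell}}(\eta_\ell) \ge \sqrt{2}\, \frac{(\bar{\gamma} - \gamma_0)(1 - \xi)}{K} \log N \big\}},$

and $|A_{j,\ell}| = N^{(\alpha - \alpha_0)/K}/2$. A crucial point is that $Y_{x_{j,\ell}}(\eta_\ell)$ is not equal to $Y_x(\eta_\ell)$ since
$x \ne x_{j,\ell}$ in general. However, it turns out that their values must be very close since the
variance of the difference is essentially a constant due to the logarithmic correlations.
Precisely, let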
Precisely, let 

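(3.23)    $C_\ell := \bigcup_{x_\ell \in \mathcal{X}_{2\epsilon^{\eta_\ell}}}\ \bigcup_{x \in \mathcal{X}_{2\epsilon^{\eta_{\ell+1}}} : \|x - x_\ell\| \le \epsilon^{\eta_\ell}/2} \Big\{ |Y_{x_\ell}(\eta_\ell) - Y_x(\eta_\ell)| \ge \sqrt{2}\, \nu\, \frac{(\bar{\gamma} - \gamma_0)(1 - \xi)}{K} \log N \Big\},$

for $\nu > 0$ which is fixed and will be chosen small later. By Lemma 5.4 of the Appendix,
$\mathrm{Var}(Y_x(\eta_\ell) - Y_{x'}(\eta_\ell)) \le \max\{\sigma_1^2, \sigma_2^2, \sigma_3^2\} < \infty$, for every $1 \le \ell \le K$, and any $x \in \mathcal{X}_{2\epsilon^{\eta_\ell}}$,
$x' \in \mathcal{X}_{2\epsilon^{\eta_{\ell+1}}}$ such that $\|x' - x\| \le \epsilon^{\eta_\ell}/2$. Therefore, a standard Gaussian estimate, see
Lemma 5.1, together with the union bound give

(3.24)    $\mathbb{P}(C_\ell) \le \exp\{-d(\log N)^2\},$

for all $1 \le \ell \le K$ and some $d > 0$.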
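It remains to bound the first term appearing in the sum of (3.21). On $C_\ell^c$, $Y_{x_{j,\ell}}(\eta_\ell)$
can be replaced by $Y_x(\eta_\ell)$ in (3.22), making a small error that depends on $\nu$. Namely,
one has $\zeta_j \ge \tilde{\zeta}_j$, where

    $\tilde{\zeta}_j := \frac{1}{|A_{j,\ell}|} \sum_{x \in A_{j,\ell}} 1_{\big\{ Y_x(\eta_{\ell+1}) - Y_x(\eta_\ell) \ge \sqrt{2}(1 + \nu)\, \frac{(\bar{\gamma} - \gamma_0)(1 - \xi)}{K} \log N \big\}}.$

Note that conditionally on $\mathcal{F}_{\epsilon^{\eta_\ell}}$, the $\tilde{\zeta}_j$'s are i.i.d. Moreover, since the $\tilde{\zeta}_j$'s are
independent of $\mathcal{F}_{\epsilon^{\eta_\ell}}$, they are also independent of each other. Lemma 5.2 of the Appendix
guarantees that the sum of the $\tilde{\zeta}_j$ cannot be too low. Observe that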

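    $\mathbb{E}[\tilde{\zeta}_j] = \mathbb{P}\Big( z \ge \sqrt{2}(1 + \nu)\, \frac{(\bar{\gamma} - \gamma_0)(1 - \xi)}{K} \log N \Big),$

where $z$ is a centered Gaussian with variance $\sigma^2 \log\big(\epsilon^{\eta_\ell}/\epsilon^{\eta_{\ell+1}}\big) = \sigma^2\, \frac{\alpha - \alpha_0}{K} \log N$. By a
Gaussian estimate, Lemma 5.1,

    $\mathbb{E}[\tilde{\zeta}_j] \ge \exp\Big\{ -\frac{1}{K}\, \frac{(1 + 2\nu)^2(\bar{\gamma} - \gamma_0)^2(1 - \xi)^2}{\sigma^2(\alpha - \alpha_0)} \log N \Big\},$

where $(1 + \nu)$ has been replaced by $(1 + 2\nu)$ to absorb the $1/\sqrt{\log N}$ term in front of
the exponential. Consequently, using elementary manipulations,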



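    $B_{\ell+1}^c \cap B_\ell \cap C_\ell^c \subset \Big\{ \sum_{j=1}^{n_\ell} \big( \tilde{\zeta}_j - \mathbb{E}[\tilde{\zeta}_j] \big) \le \frac{2 n_{\ell+1}}{N^{(\alpha - \alpha_0)/K}} - n_\ell N^{-\frac{1}{K} \frac{(1 + 2\nu)^2(\bar{\gamma} - \gamma_0)^2(1 - \xi)^2}{\sigma^2(\alpha - \alpha_0)}} \Big\}$
    $\subset \Big\{ \Big| \sum_{j=1}^{n_\ell} \big( \tilde{\zeta}_j - \mathbb{E}[\tilde{\zeta}_j] \big) \Big| \ge \frac{1}{2}\, n_\ell N^{-\frac{1}{K} \frac{(1 + 2\nu)^2(\bar{\gamma} - \gamma_0)^2(1 - \xi)^2}{\sigma^2(\alpha - \alpha_0)}} \Big\},$

provided

    $\frac{1}{K}\, \frac{(1 + 2\nu)^2(\bar{\gamma} - \gamma_0)^2(1 - \xi)^2}{\sigma^2(\alpha - \alpha_0)} < \frac{1}{K}\, \frac{(\bar{\gamma} - \gamma_0)^2}{\sigma^2(\alpha - \alpha_0)},$

that is

(3.25)    $(1 + 2\nu)(1 - \xi) < 1.$

Fix $\nu$ small enough such that (3.25) is satisfied. Write for short

    $\mu := \frac{1}{K}\, \frac{(1 + 2\nu)^2(\bar{\gamma} - \gamma_0)^2(1 - \xi)^2}{\sigma^2(\alpha - \alpha_0)}.$

Then, taking $n = n_\ell$ and $t = \frac{1}{2} n_\ell N^{-\mu}$ in Lemma 5.2 we get

    $\mathbb{P}(B_{\ell+1}^c \cap B_\ell \cap C_\ell^c) \le 2 \exp\Big\{ -\frac{\frac{1}{4} n_\ell^2 N^{-2\mu}}{2 n_\ell + \frac{1}{3} n_\ell N^{-\mu}} \Big\}.$

By the form of $n_\ell$ in (3.20), $K$ can be taken large enough so that $n_\ell N^{-2\mu} > N^{\delta}$ for
some $\delta > 0$ and all $\ell = 1, \dots, K + 1$. This concludes the proof of the lemma. □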



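Proof of Lemma 3.8. Take $\alpha' < \alpha$ in such a way that $\mathcal{X}_{\epsilon^{\alpha'}} \subset \mathcal{X}_{\epsilon^{\alpha}}$. Consider the set

    $\Lambda := \{x \in \mathcal{X}_{\epsilon^{\alpha'}} : Y_x(\alpha') \ge -\sigma_1(\alpha - \alpha') \log N\},$

and the event

    $A := \{ |\Lambda| \ge N^{\delta} \}, \quad \delta > 0.$

The parameters $\mathcal{E}$, $\delta$ and $\alpha'$ will be chosen later as a function of $\alpha$. By splitting the
probability on the event $A$,

    $\mathbb{P}\big( \#\{x \in \mathcal{X}_{\epsilon^{\alpha}} : Y_x(\alpha) \ge 0\} \le N^{\mathcal{E}} \big)$
    $\le \mathbb{P}\big( \#\{x \in \mathcal{X}_{\epsilon^{\alpha}} : Y_x(\alpha) \ge 0\} \le N^{\mathcal{E}};\ A \big) + \mathbb{P}(A^c)$
    $\le \mathbb{E}\big[ \mathbb{P}\big( \#\{x \in \Lambda : Y_x(\alpha) - Y_x(\alpha') \ge \sigma_1(\alpha - \alpha') \log N\} \le N^{\mathcal{E}} \,\big|\, \mathcal{F}_{\epsilon^{\alpha'}} \big);\ A \big] + \mathbb{P}(A^c),$

where the second inequality is obtained by restricting to the set $\Lambda \subset \mathcal{X}_{\epsilon^{\alpha}}$.

First we prove that the definition of $A$ yields a super-exponential decay of the first
term for $\mathcal{E}$ and $\delta$ depending on $\alpha - \alpha'$. The variables $Y_x(\alpha) - Y_x(\alpha')$, $x \in \mathcal{X}_{\epsilon^{\alpha'}}$, are
i.i.d. Gaussians of variance $\sigma_1^2(\alpha - \alpha') \log N$. Write for simplicity $(z_i, i = 1, \dots, N^{\delta})$
for i.i.d. Gaussian random variables with variance $\sigma_1^2(\alpha - \alpha') \log N$. A standard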



Gaussian estimate (see Lemma 5.1 of the Appendix) implies 

1 p-lia-a')logN 

P {z, > aM - a') log AT) > - . > e-5(«-«')iogA^. 

^ a/ (a — a') log A* 

Therefore 

E [P (#{x G A : Y,{a) - Y,{a') > a,{a - a') log A^} < A^^ | J"^.,) ; A] 

< P 5^ (l{.,>.i(a-a')iogiv} - P (^. > cr.ia - a') logN)) < N' - iV^-|("-°') 



Lemma 5.2 in the Appendix gives a super-exponential decay of the above probability 
for the choice 6 > |(a — a') and S — 6 + ^{a — a') < 0, for example 6 = 2(a — a') and 
S = a — a'. 

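It remains to show that $\mathbb{P}(A^c)$ has super-exponential decay. We have

    $\mathbb{P}(A^c) \le \mathbb{P}\Big( A^c,\ \max_{x \in \mathcal{X}_{\epsilon^{\alpha'}}} Y_x(\alpha') \le (\log N)^2 \Big) + \mathbb{P}\Big( \max_{x \in \mathcal{X}_{\epsilon^{\alpha'}}} Y_x(\alpha') > (\log N)^2 \Big).$

The second term is easily shown to have the desired decay. We focus on the first. On
the event $A^c \cap \{\max_{x \in \mathcal{X}_{\epsilon^{\alpha'}}} Y_x(\alpha') \le (\log N)^2\}$,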

1 >r^ , . 1 






(3.26) "^'^"' "^^ 



X 



< ^(logAr)2 + (l - ^) (-ai(« - a') log AT). 



Since \X^c' \ = N'^ , it is easily checked that for 6 = 2{a — a') < a', the above is smaller 
than — |o"i(a — a') log A^. Therefore we choose a' such that a < 3a'/2. Finally the left 



side of (3.26) is a Gaussian random variable, whose variance is of order 1. Therefore 



the probability that it is smaller than — |cri(a — a') log A^ is super-exponentially small. 
This completes the proof of the lemma. D 

4. The free energy from the high points: proof of Proposition 2.1

In this section, we compute the free energy of the perturbed models introduced
in Section 2.1. The free energy $f_N^{(\sigma,\alpha)}(\beta)$ is shown to converge in probability to the
claimed expression. The $L^1$-convergence then follows from the fact that the variables
$(f_N^{(\sigma,\alpha)}(\beta))_{N \ge 1}$ are uniformly integrable. This is a consequence of the Borell-TIS
inequality. (Another more specific approach, used by Capocaccia, Cassandro and Picco
for the GREM models in Section 3.1 of their work, could also have been applied here.)
Indeed, we clearly have

log N log N 

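Therefore, uniform integrability follows if it is proved that $\frac{1}{(\log N)^2} \mathbb{E}\big[ (\max_{x \in \mathcal{X}_N} Y_x)^2 \big]$
is uniformly bounded. It equals

    $\frac{1}{(\log N)^2}\, \mathbb{E}\Big[ \big( \max_{x \in \mathcal{X}_N} Y_x - \mathbb{E}[\max_{x \in \mathcal{X}_N} Y_x] \big)^2 \Big] + \frac{1}{(\log N)^2}\, \mathbb{E}\big[ \max_{x \in \mathcal{X}_N} Y_x \big]^2.$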

The second term is uniformly bounded by comparing with i.i.d. centered Gaussian
random variables of variance $V_{123} \log N$ and using Slepian's inequality (see e.g. [1],
page 57). For the first term, we use the Borell-TIS inequality (see [1], page 50)

    $\mathbb{P}\Big( \big| \max_{x \in \mathcal{X}_N} Y_x - \mathbb{E}\max_{x \in \mathcal{X}_N} Y_x \big| > r \Big) \le 2 e^{-\frac{r^2}{2 V_{123} \log N}}, \quad \forall r > 0,$

to get

    $\mathbb{E}\Big[ \Big( \frac{\max_{x \in \mathcal{X}_N} Y_x - \mathbb{E}[\max_{x \in \mathcal{X}_N} Y_x]}{\log N} \Big)^2 \Big] \le 4 \int_0^{\infty} r\, e^{-\frac{r^2 \log N}{2 V_{123}}}\, dr,$

which goes to zero for $N \to \infty$. The almost-sure convergence is straightforward from
the $L^1$-convergence and the almost-sure self-averaging property of the free energy:

    $\lim_{N\to\infty} \big| f_N^{(\sigma,\alpha)}(\beta) - \mathbb{E} f_N^{(\sigma,\alpha)}(\beta) \big| = 0$, a.s.

This is a standard consequence of concentration of measure (see [30], page 32) since
the free energy is a Lipschitz function of i.i.d. Gaussian variables of Lipschitz constant
smaller than $\beta/\sqrt{\log N}$. (Note that the $Y_x$'s can be written as a linear combination
of i.i.d. standard Gaussians with coefficients chosen to get the correct covariances.)

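It remains to prove that the free energy $f_N^{(\sigma,\alpha)}(\beta)$ converges in probability to the
claimed expression in Proposition 2.1. For fixed $\beta > 0$ and $\nu > 0$, we prove that

(4.1)    $\lim_{N\to\infty} \mathbb{P}\big( f_N^{(\sigma,\alpha)}(\beta) \le f^{(\sigma,\alpha)}(\beta) - \nu \big) = 0,$

(4.2)    $\lim_{N\to\infty} \mathbb{P}\big( f_N^{(\sigma,\alpha)}(\beta) \ge f^{(\sigma,\alpha)}(\beta) + \nu \big) = 0.$

First, we introduce some notation and give a preliminary result. For simplicity, we
will write $\mathcal{E}$ for $\mathcal{E}^{(\sigma,\alpha)}$ throughout the proof. For any $M \in \mathbb{N}$, consider the partition
of $[0, \gamma_{max}]$ into $M$ intervals $[\gamma_{i-1}, \gamma_i[$, where the $\gamma_i$'s are given by

    $\gamma_i := \frac{i}{M}\, \gamma_{max}, \quad i = 0, 1, \dots, M.$

Moreover for any $N \ge 2$, any $M \in \mathbb{N}$ and any $\delta > 0$, define the random variables

    $K^+_{N,M}(i) := \#\Big\{ x \in \mathcal{X}_N : \frac{Y_x}{\sqrt{2} \log N} \in [\gamma_{i-1}, \gamma_i[ \Big\}, \quad 1 \le i \le M,$
    $K^-_{N,M}(i) := \#\Big\{ x \in \mathcal{X}_N : -\frac{Y_x}{\sqrt{2} \log N} \in [\gamma_{i-1}, \gamma_i[ \Big\}, \quad 1 \le i \le M,$

and the events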

M 



fl |#{x G Xn : |n.| > V2imaJogN} = 0} 



The next result is a straightforward consequence of Proposition 3.1, Proposition 3.2
and the fact that Gaussian random variables are symmetric.

Lemma 4.1. For any $M \in \mathbb{N}$ and any $\delta > 0$, we have

    $\lim_{N\to\infty} \mathbb{P}\big( B^+_{N,M,\delta} \cap B^-_{N,M,\delta} \big) = 1.$

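Define the continuous function

    $P_\beta(\gamma) := \mathcal{E}(\gamma) + \sqrt{2}\beta\gamma, \quad \forall \gamma \in [0, \gamma_{max}].$

Using the expression of $\mathcal{E}$ in Proposition 3.2 on the different intervals, it is easily
checked by differentiation that

(4.3)    $\sup_{\gamma \in [0, \gamma_{max}]} P_\beta(\gamma) = f^{(\sigma,\alpha)}(\beta).$

Furthermore, the continuity of $\gamma \mapsto P_\beta(\gamma)$ on $[0, \gamma_{max}]$ yields

    $\max_{0 \le i \le M-1} P_\beta(\gamma_i) \to \sup_{\gamma \in [0, \gamma_{max}]} P_\beta(\gamma) = f^{(\sigma,\alpha)}(\beta), \quad M \to \infty.$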
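To illustrate how the discretized maximum approaches the supremum in (4.3), the sketch below evaluates $P_\beta$ on the grid $\gamma_i = i\gamma_{max}/M$ using the Case 1 entropy. The comparison value $1 + \beta^2 V_{123}/2$ is obtained here by setting the derivative of $1 - \gamma^2/V_{123} + \sqrt{2}\beta\gamma$ to zero, and it is only valid under the assumption that the maximizer $\beta V_{123}/\sqrt{2}$ lies below the Case 1 threshold. Names and parameter values are hypothetical.

```python
import math

def entropy(gamma, v12, s3, a2):
    """Case 1 entropy with threshold V123 * sqrt(alpha_2 / V12)."""
    v123 = v12 + s3**2 * (1 - a2)
    if gamma < v123 * math.sqrt(a2 / v12):
        return 1 - gamma**2 / v123
    return (1 - a2) - (gamma - math.sqrt(v12 * a2))**2 / (s3**2 * (1 - a2))

def pressure(gamma, beta, v12, s3, a2):
    """P_beta(gamma) = E(gamma) + sqrt(2) * beta * gamma."""
    return entropy(gamma, v12, s3, a2) + math.sqrt(2) * beta * gamma

def discretized_sup(beta, m, v12, s3, a2):
    """max over 0 <= i <= M-1 of P_beta(gamma_i), with gamma_i = i * gamma_max / M."""
    gamma_max = math.sqrt(v12 * a2) + s3 * (1 - a2)
    return max(pressure(i * gamma_max / m, beta, v12, s3, a2) for i in range(m))

# Hypothetical Case 1 values and a small inverse temperature.
V12, S3, A2, BETA = 0.732, 1.0, 0.6, 0.5
```

For these values the grid maximum with $M = 2000$ agrees with $1 + \beta^2 V_{123}/2$ to within $10^{-5}$, and it never exceeds it.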

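Fix $M \in \mathbb{N}$ large enough and $\delta > 0$ small enough, such that

(4.4)    $\max_{0 \le i \le M-1} P_\beta(\gamma_i) \ge f^{(\sigma,\alpha)}(\beta) - \frac{\nu}{3},$

(4.5)    $\frac{\sqrt{2}\beta\, \gamma_{max}}{M} \le \frac{\nu}{3},$

(4.6)    $\delta < \min\Big\{ -\frac{1}{2} \min_{1 \le i \le M} \{\mathcal{E}(\gamma_i) - \mathcal{E}(\gamma_{i-1})\},\ \frac{\nu}{3} \Big\}.$

Note that for fixed $M$, $\min_{1 \le i \le M} \{\mathcal{E}(\gamma_i) - \mathcal{E}(\gamma_{i-1})\} < 0$ since $\gamma \mapsto \mathcal{E}(\gamma)$ is a decreasing
function on $[0, \gamma_{max}]$.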

Proof of the lower bound (4.1). Observe that the partition function $Z_N^{(\sigma,\alpha)}(\beta)$ associated
with the perturbed model satisfies $Z_N^{(\sigma,\alpha)}(\beta) \ge \sum_{i=1}^{M} K^+_{N,M}(i)\, N^{\sqrt{2}\beta\gamma_{i-1}}$. Therefore
on $B^+_{N,M,\delta}$ we get

M 



i=l 



This yields on B 



N,M,S 



, , Inp-ri — A^™™l<i<M{£(7^)-^(7i-l)}+2<5^ 



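Since for $\delta$ in (4.6)

    $\lim_{N\to\infty} (\log N)^{-1} \log\big( 1 - N^{\min_{1 \le i \le M} \{\mathcal{E}(\gamma_i) - \mathcal{E}(\gamma_{i-1})\} + 2\delta} \big) = 0,$

the choices of $M$, $\delta$ in (4.4) and (4.6) give that $f_N^{(\sigma,\alpha)}(\beta) - f^{(\sigma,\alpha)}(\beta) \ge -\nu$ on $B^+_{N,M,\delta}$
for $N$ large enough. Therefore, (4.1) is a consequence of Lemma 4.1.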
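Proof of the upper bound (4.2).
Observe first that the partition function $Z_N^{(\sigma,\alpha)}(\beta)$ satisfies on $B^+_{N,M,\delta} \cap B^-_{N,M,\delta}$

    $Z_N^{(\sigma,\alpha)}(\beta) \le \sum_{i=1}^{M} K^+_{N,M}(i)\, N^{\sqrt{2}\beta\gamma_i} + \sum_{i=1}^{M} K^-_{N,M}(i)\, N^{-\sqrt{2}\beta\gamma_{i-1}}.$

Since on $B^+_{N,M,\delta} \cap B^-_{N,M,\delta}$ the random variables $K^+_{N,M}(i)$ and $K^-_{N,M}(i)$ are less than
$N^{\mathcal{E}(\gamma_{i-1}) + \delta}$ for all $1 \le i \le M$, this yields

    $Z_N^{(\sigma,\alpha)}(\beta) \le \sum_{i=1}^{M} N^{\mathcal{E}(\gamma_{i-1}) + \delta + \sqrt{2}\beta\gamma_i} + \sum_{i=1}^{M} N^{\mathcal{E}(\gamma_{i-1}) + \delta - \sqrt{2}\beta\gamma_{i-1}}.$

Therefore, we get

    $f_N^{(\sigma,\alpha)}(\beta) \le \frac{\log(2M)}{\log N} + \sup_{\gamma \in [0, \gamma_{max}]} P_\beta(\gamma) + \frac{\sqrt{2}\beta\, \gamma_{max}}{M} + \delta,$

on $B^+_{N,M,\delta} \cap B^-_{N,M,\delta}$. Recalling (4.3) and since $\lim_{N\to\infty} (\log N)^{-1} \log(2M) = 0$, the
choices of $M$ and $\delta$ in (4.5) and (4.6) imply that $f_N^{(\sigma,\alpha)}(\beta) - f^{(\sigma,\alpha)}(\beta) \le \nu$ on $B^+_{N,M,\delta} \cap
B^-_{N,M,\delta}$ for $N$ large enough. Therefore (4.2) is a consequence of Lemma 4.1. □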



5. Appendix 



5.1. Gaussian estimates, large deviation result and integration by parts.

Lemma 5.1 (see e.g. [21]). Let $X$ be a standard Gaussian random variable.

(1) For any $a > 0$,

    $\mathbb{P}(|X| \ge a) \le e^{-a^2/2}.$

(2) For any $a \ge 1$,

    $\mathbb{P}(|X| \ge a) \ge \frac{e^{-a^2/2}}{\sqrt{2\pi}\, a}.$

(3) Moreover we have the following approximation for $a$ large

    $\frac{(1 - 2a^{-2})}{\sqrt{2\pi}\, a}\, e^{-a^2/2} \le \mathbb{P}(X \ge a) \le \frac{1}{\sqrt{2\pi}\, a}\, e^{-a^2/2}.$
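The three Gaussian tail estimates, in the form read here, can be checked against the exact tail computed from the complementary error function. This is only a numerical sanity check on a few values of $a$; the helper name `gaussian_tail` is an illustrative choice.

```python
import math

def gaussian_tail(a):
    """Exact P(X >= a) for a standard Gaussian X, via the complementary error function."""
    return 0.5 * math.erfc(a / math.sqrt(2))

def check_tail_bounds(a):
    """Return True if the three estimates of Lemma 5.1 (as read here) hold at a >= 1."""
    density_term = math.exp(-a**2 / 2) / (math.sqrt(2 * math.pi) * a)
    one_sided = gaussian_tail(a)
    two_sided = 2 * one_sided
    upper_two_sided = two_sided <= math.exp(-a**2 / 2)                    # item (1)
    lower_two_sided = two_sided >= density_term                           # item (2)
    sandwich = (1 - 2 / a**2) * density_term <= one_sided <= density_term  # item (3)
    return upper_two_sided and lower_two_sided and sandwich
```

For instance, `check_tail_bounds(3.0)` evaluates the bounds at $a = 3$, where the exact one-sided tail is about $1.35 \times 10^{-3}$.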



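Lemma 5.2 (see e.g. [4]). Let $Z_1, \dots, Z_n$ be i.i.d. real valued random variables
satisfying $\mathbb{E}[Z_i] = 0$, $\sigma^2 = \mathbb{E}[Z_i^2]$ and $\|Z_i\|_\infty \le 1$. Then for any $t > 0$,

    $\mathbb{P}\Big( \Big| \sum_{i=1}^{n} Z_i \Big| \ge t \Big) \le 2 \exp\Big\{ -\frac{t^2}{2n\sigma^2 + 2t/3} \Big\}.$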
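As an illustration of this Bernstein-type bound, the sketch below compares it with the exact two-sided tail of a centered binomial sum, that is $Z_i = B_i - p$ with $B_i$ Bernoulli($p$), for which $\|Z_i\|_\infty \le 1$ and $\sigma^2 = p(1 - p)$. The function names and the choice of test distribution are assumptions made for the example.

```python
import math

def bernstein_bound(n, variance, t):
    """Right side of Lemma 5.2: 2 * exp(-t^2 / (2 n sigma^2 + 2 t / 3))."""
    return 2 * math.exp(-t**2 / (2 * n * variance + 2 * t / 3))

def centered_binomial_tail(n, p, t):
    """Exact P(|Bin(n, p) - n p| >= t), summing the binomial probability mass function."""
    mean = n * p
    return sum(
        math.comb(n, k) * p**k * (1 - p)**(n - k)
        for k in range(n + 1)
        if abs(k - mean) >= t
    )
```

With $n = 200$, $p = 0.1$ and $t = 15$, the bound evaluates to roughly $0.015$, and the exact tail lies below it.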



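Lemma 5.3 (see e.g. Appendix of [30]). Let $(X, Z_1, \dots, Z_d)$ be a centered Gaussian
random vector. Then, for any $C^1$ function $F : \mathbb{R}^d \to \mathbb{R}$, of moderate growth at infinity,
we have

    $\mathbb{E}[X F(Z_1, \dots, Z_d)] = \sum_{i=1}^{d} \mathbb{E}[X Z_i]\, \mathbb{E}\Big[ \frac{\partial F}{\partial z_i}(Z_1, \dots, Z_d) \Big].$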






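5.2. Proof of Lemma 2.2. Recall that $\epsilon = 1/N < 1/2$, and $t, \delta \in (0, 1)$ are such
that $t + \delta < 1$. Also by definition, $\|x' - x\| = \epsilon^{q(x,x')}$.

It is clear that $\mathbb{E}[X_x X_x] = \mathbb{E}[(X_x)^2]$, which is the variance of the centered Gaussian
random variable $\mu(A_{\epsilon^{t+\delta}}(x) \setminus A_{\epsilon^t}(x))$. This variance can be computed and equals

    $\int_{\epsilon^{t+\delta}}^{\epsilon^t} y^{-1}\, dy = [\log y]_{\epsilon^{t+\delta}}^{\epsilon^t} = \delta \log N.$

For the covariance, observe that $\mathbb{E}[X_x X_{x'}]$ is equal to the variance of the random
variable $\mu((A_{\epsilon^{t+\delta}}(x) \setminus A_{\epsilon^t}(x)) \cap A_\epsilon(x'))$. If $\epsilon \le \ell := \|x' - x\| \le \epsilon^{t+\delta}$ (i.e. $t + \delta \le
q(x, x') \le 1$), then the subsets $A_\epsilon(x)$ and $A_\epsilon(x')$ intersect below the line $y = \epsilon^{t+\delta}$;
thus, the covariance is given by

    $\mathbb{E}[X_x X_{x'}] = \int_{\epsilon^{t+\delta}}^{\epsilon^t} \frac{y - \ell}{y^2}\, dy = [\log y]_{\epsilon^{t+\delta}}^{\epsilon^t} + \ell\, \Big[\frac{1}{y}\Big]_{\epsilon^{t+\delta}}^{\epsilon^t} = \delta \log N + O(1).$

If $\epsilon^{t+\delta} \le \ell = \|x' - x\| \le \epsilon^t$ (i.e. $t \le q(x, x') \le t + \delta$), then the subsets intersect in
between the lines $y = \epsilon^{t+\delta}$ and $y = \epsilon^t$, thus

    $\mathbb{E}[X_x X_{x'}] = \int_{\ell}^{\epsilon^t} \frac{y - \ell}{y^2}\, dy = [\log y]_{\ell}^{\epsilon^t} + \ell\, \Big[\frac{1}{y}\Big]_{\ell}^{\epsilon^t} = (q(x, x') - t) \log N + O(1).$

Finally if $\ell = \|x' - x\| > \epsilon^t$ (i.e. $0 \le q(x, x') < t$), then the set $(A_{\epsilon^{t+\delta}}(x) \setminus A_{\epsilon^t}(x)) \cap
A_\epsilon(x')$ is empty and thus $\mathbb{E}[X_x X_{x'}] = 0$. □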



5.3. A key property of the perturbed models. The following lemma is a key
tool to approximate the Gaussian field we consider by a tree. Indeed the difference
between the contribution to the Gaussian field at a certain scale for two points that
are close can be explicitly computed by integrating parallelograms, see Figure 5 below,
and is shown to be small.



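Lemma 5.4. Fix $\alpha, \alpha_0$ as in Lemma 3.7, $u$ such that $\alpha_0 \le u \le \alpha$ and $\delta \in (0, 1)$.
Then for all $x, x' \in \mathcal{X}_\epsilon$ such that $\|x - x'\| \le \delta \epsilon^u$, we have

    $\mathrm{Var}\big( Y_x(u) - Y_{x'}(u) \big) \le 2\bar{\sigma}^2 \delta,$

where $\bar{\sigma}$ denotes an upper bound for the $\sigma_i$'s.

Proof. Writing $A := A_{\epsilon^u}(x) \triangle A_{\epsilon^u}(x')$, we have

    $\mathrm{Var}\big( Y_x(u) - Y_{x'}(u) \big) \le \bar{\sigma}^2 \int_A y^{-2}\, ds\, dy = 2\bar{\sigma}^2 \|x - x'\| \int_{\epsilon^u}^{\infty} y^{-2}\, dy = 2\bar{\sigma}^2\, \frac{\|x - x'\|}{\epsilon^u} \le 2\bar{\sigma}^2 \delta,$

which concludes the proof of the lemma. □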